Kitty Cardwell, Geoffrey Dennis, Andrew Flannery, Jacqueline Fletcher, Doug Luster, Mark Nakhla, Anna Rice, Pat Shiel, James Stack, Colin Walsh, and Laurene Levy (in memoriam)
Cardwell, K. 2018. Diagnostic Assay Validation Terminology. The Plant Health Instructor. DOI: 10.1094/PHI-I-2018-0709-01
Validation of diagnostic assays refers to metrics and definitions that help frame the performance characteristics of the assay. Validation metrics are designed to show how reliable an assay is under various conditions. Decisions about some of the validation metrics in an assay may differ by purpose, depending upon factors such as the consequences of a false-positive or false-negative test result. However, the research needed to fully validate an assay can take time, so there may be occasions when diagnosticians have to use an assay before all of the performance metrics are fully known. Diagnostic assay developers, whether in the private sector or in a university lab, must be familiar with the terminology and statistics of diagnostic assay validation. Plant disease diagnosticians may find the following definitions helpful when describing their confidence in the outcome of an assay. The following glossary of terminology has been adapted from multiple sources for plant pathologists, and is considered a work in progress.
ACCREDITED LABORATORY: Laboratory that has been verified by a third-party entity to conform to a specific standard.
ACCURACY: Assessment of nearness of a test value to the expected value. The expected value may be obtained from a known reference standard (plant pest, pathogen, or biomolecule associated with either), reagent of known activity, or well-documented titer. This term may be used in other fields and regions to represent both trueness (ICH Q2, 2005) and bias (GUM, 2008), or is an umbrella term broken down into specific categories of trueness and bias to evaluate systematic error (VIM, 2007).
ACTION LEVEL: Level of concern for an analyte that must be reliably identified or quantified in a sample.
ANALYTE: The specific organism or molecule being detected and/or measured in a test sample.
ANALYTICAL BATCH: Samples that are analyzed within the same time period with the same method sequence and lots of reagents.
ASSAY: Step in a protocol, such as a diagnostic assay, when a sample material is subjected to a test that provides a measurement or a binary response (detected or not detected). The assay is distinct from the earlier steps in the protocol, such as obtaining, extracting and purifying sample material. Examples include enzyme immunoassay, complement fixation test, dipsticks, chromatography, or polymerase chain reaction.
BIAS: Dispersal of error in multiple measurements between assay results and an accepted reference value. Note: Bias is systematic error as contrasted to random error and there may be multiple components contributing to systematic error. Because bias is an assessment of systematic error, it is evaluated as a component of the diagnostic test (VIM, 2007).
BIOSAFETY: Processes and procedures that protect the diagnostician or researcher from harmful exposure to a biological organism or chemical agent.
BIOCONTAINMENT: Infrastructure, processes and procedures to reduce risk of accidental escape or intentional release of a potentially high consequence biological organism or agent.
BIOSECURITY: Infrastructure, processes and procedures to reduce the risk of accidental, natural or intentional introduction of infectious or invasive organisms harmful to human, animal and plant life.
CALIBRATION: A set of operations that establish and verify optimal functional standard values of a measuring instrument or system.
CERTIFIED DIAGNOSTICIAN: A laboratory clinician who has received a specific training or a successful performance assessment from an independent entity. The Certification can be used to authorize use of a test or diagnostic activity.
CHAIN OF CUSTODY: A protocol that documents the movement and handling of samples to be tested to ensure integrity and reliability for specific regulatory or legal evidentiary purposes.
COMPARABILITY: Term indicating that performance characteristics of an existing validated assay or protocol will produce a similar result when a minor change is introduced. Similarity is established within statistically defined limits (ICH Q5, 2004).
CONFIDENCE INTERVAL: Probability that the true mean is located within a range of error of measurements (GUM, 2008).
CONFIDENCE LEVEL: Probability that the true mean will consistently be located within the confidence interval over time (GUM, 2008).
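As an illustration of the two entries above, the following minimal Python sketch computes an approximate confidence interval for the mean of replicate measurements at a chosen confidence level. It assumes a normal approximation (a t-distribution would give a wider interval for small numbers of replicates), and the function name and replicate values are hypothetical.

```python
import math
import statistics

def mean_confidence_interval(measurements, confidence=0.95):
    """Return (lower, upper) bounds of an approximate confidence interval for the mean."""
    n = len(measurements)
    mean = statistics.mean(measurements)
    # Standard error of the mean, using the sample standard deviation (n - 1).
    standard_error = statistics.stdev(measurements) / math.sqrt(n)
    # Two-sided critical value from the standard normal distribution (an approximation).
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * standard_error, mean + z * standard_error

replicates = [10.0, 12.0, 11.0, 13.0, 9.0]  # hypothetical replicate measurements
lower, upper = mean_confidence_interval(replicates)
print(f"95% interval for the mean: {lower:.2f} to {upper:.2f}")
```

Raising the confidence level (for example from 0.95 to 0.99) widens the interval for the same data.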
CONFIRMATORY METHOD: A method that provides the highest confidence for confirmation of the identity and possibly quantity of the analyte. Confirmatory methods may be based on a combination of techniques such as ELISA and/or PCR and taxonomic identification.
Ct VALUE: Threshold cycle, a term similar to crossing point and take-off point put forward by real-time instrument manufacturers to distinguish their product from competitors (Bustin, et al., 2009).
Cq VALUE: Quantitative cycle, a specific point chosen in the PCR cycle used to either determine a quantitative measurement or distinguish a detection qualitatively from background (Bustin, et al., 2009).
DIAGNOSTIC ASSAY: An assay intended to provide results for a specific diagnosis of an organism, disease or condition. The diagnostic assay is a component of the diagnostic protocol.
DIAGNOSTIC PROTOCOL: Overall activities required to make and complete diagnosis; including sampling plans, sample handling, and the diagnostic assay(s). These activities often culminate in a diagnostic determination.
EFFICACY: Specific ability of a diagnostic assay or protocol to produce the result for which it is offered when used under the conditions recommended by the manufacturer (the manufacturer can be a commercial or other entity making the product). Also known as a defined, measurable and reproducible effect produced by a treatment to prevent, remove, or control the organism or analyte (ISPM 5, 2017).
EQUIVALENCY TESTING: Determination of certain assay performance characteristics of new or different test methods by means of an inter-laboratory comparison to a standard test method; implied in this definition is that participating laboratories are using their own test methods, reagents and controls and that results are expressed qualitatively. (Associated with COMPARABILITY TESTING).
FALSE NEGATIVE: A negative reaction of a test sample obtained from a plant known to be exposed to or infected with the target organism. This may be due to lack of analytical sensitivity, restricted analytical specificity or analyte degradation. False negatives lead to a decrease in diagnostic sensitivity.
FALSE POSITIVE: A positive reaction of a test that is not attributable to exposure to or infection with the target organism. This may be due to cross-reactivity, cross-contamination of the test sample or non-specific reactions and matrix effects. False positives lead to a decrease in diagnostic specificity.
FITNESS FOR PURPOSE: Translating stated need or purpose into analytical requirement.
HARMONIZATION: The result of an agreement between laboratories to calibrate similar test methods, adjust diagnostic thresholds and express test data in such a manner as to allow uniform interpretation of results between laboratories.
INTERMEDIATE PRECISION: Level of agreement between replicates of the same sample in similar conditions by the same lab (ICH Q2, 2005). For example, the sample is tested by analyst A and analyst B, or tested on instrument ABC and DEF, or tested using reagent lots UVW and XYZ on different days, in any combination (VIM, 2007).
LEVELS OF VALIDATION: In many cases, complete quantification of all performance characteristics of an assay is impractical or not required to demonstrate fitness, depending on the purpose of the assay. Many of the U.S. human clinical, food, and environmental laboratory testing networks classify specific characteristic levels to match the validation requirements to their specific required purpose. These classifications are described as Tier 1 through Tier 4. In Tier 1, repeatability and an early statistical evaluation of analytical specificity/sensitivity are established within one laboratory, for use by that laboratory. Tier 2 includes Tier 1 plus an analysis of the assay by two labs, which generates more performance statistics and defines the diagnostic specificity/sensitivity and limit of detection. Tier 3 includes Tiers 1 and 2, and involves multiple labs and people working with the assay to determine reproducibility of results and robustness of the assay to variable conditions. A completely characterized assay for broad and general use is classified as Tier 4.
LIMIT OF DETECTION (LOD): Estimated lowest amount of an analyte in a specified matrix that can be reliably distinguished from absence of the analyte (a blank). Positive results are expressed as a percent and the probability of detection (VIM, 2007).
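One common way to put this definition into practice is to test replicates of a dilution series and report the percent of replicates detected at each level. The Python sketch below is illustrative only: the 95% detection criterion, the function names and the replicate data are assumptions, not values taken from this glossary.

```python
def detection_rates(results):
    """Map each concentration to the fraction of replicates that were detected."""
    return {conc: sum(hits) / len(hits) for conc, hits in results.items()}

def estimate_lod(results, min_rate=0.95):
    """Return the lowest concentration detected at >= min_rate, scanning downward.

    The scan stops at the first level that fails the criterion, so an isolated
    pass below a failed level is not reported. Returns None if no level passes.
    """
    rates = detection_rates(results)
    lod = None
    for conc in sorted(rates, reverse=True):
        if rates[conc] >= min_rate:
            lod = conc
        else:
            break
    return lod

# Hypothetical dilution series: concentration (copies per reaction) -> replicate outcomes.
dilution_series = {
    100: [True] * 20,
    10: [True] * 19 + [False],
    1: [True] * 12 + [False] * 8,
}
rates = detection_rates(dilution_series)
lod = estimate_lod(dilution_series)
print(rates, lod)
```

In this made-up series the 10-copy level is detected in 19 of 20 replicates, so it is the lowest level that meets the assumed 95% criterion.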
LINEARITY: Ability within a given range to obtain test results which are directly proportional to the concentration (amount) of analyte in the sample (ICH Q2, 2005).
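Linearity is often checked by fitting a straight line to results obtained from known analyte amounts and examining the fit. The following is a minimal sketch using an ordinary least-squares fit written out by hand. The function name and the data are hypothetical, and what counts as an acceptable coefficient of determination is left to the laboratory.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept; also returns R squared."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)  # coefficient of determination
    return slope, intercept, r_squared

# Hypothetical data: known analyte amounts and the corresponding test responses.
amounts = [1.0, 2.0, 3.0, 4.0]
responses = [2.0, 4.0, 6.0, 8.0]
slope, intercept, r_squared = linear_fit(amounts, responses)
print(slope, intercept, r_squared)
```

An R squared close to 1 over the tested range suggests the response is proportional to the amount of analyte within that range.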
MATRIX: All the constituents that collectively form a test sample (ISO 16577, 2016).
MATRIX BLANK: A quality control sample consisting of a specific matrix that does not contain the organism, or biomolecule, of interest, and does not contain matrix components that might interfere with the assay (IUPAC, 2002).
MATRIX SPIKE: An aliquot of a sample prepared by adding a known quantity of analyte to a specified matrix. Spiked samples are often used to establish or support accuracy, linearity and limit of detection.
MELTING POINT: The transition of a substance from its normal property at room temperature. In PCR, it refers to the temperature at which 50% of the double-stranded DNA has separated into two single strands.
NATIONAL PLANT DIAGNOSTIC NETWORK (NPDN): A U.S. national network of diagnostic laboratories that rapidly and accurately detect and report pathogens that cause plant diseases of national interest, particularly those that could be deemed to be a biosecurity risk. The mission of NPDN is accomplished through an effective communication network of regional expertise that uses harmonized reporting protocols to update a national database of pest and disease occurrences. https://www.npdn.org
NATIONAL CLEAN PLANT NETWORK (NCPN): Created to protect U.S. specialty crops such as grapes, fruit trees, citrus, berries, roses and sweet potatoes from the spread of economically harmful plant pests and diseases, the NCPN ensures the global competitiveness of U.S. specialty crops by creating high standards for clean germplasm. http://nationalcleanplantnetwork.org
NATIONAL SEED HEALTH SYSTEM (NSHS): A program authorized by USDA-APHIS and administered by the Iowa State University Seed Science Center to accredit both private and public entities to perform activities needed to support the issuance of Federal phytosanitary certificates for the international movement of seed. Through the NSHS, new seed health testing methods are incorporated into the accreditation program to maintain the program on the cutting edge of technology. www.seedhealth.org.
PREDICTIVE VALUE OF A POSITIVE TEST RESULT: Number of true positives / (number of true positives + number of false positives) × 100.
PREDICTIVE VALUE OF A NEGATIVE TEST RESULT: Number of true negatives / (number of true negatives + number of false negatives) × 100. Note: positive and negative predictive values are influenced by the prevalence of disease in the population. At low disease prevalence the chance that a positive result is a false positive is higher, whereas at high disease prevalence the chance that a positive result is a true positive is higher.
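The two predictive-value formulas above, and the effect of prevalence described in the note, can be sketched in Python as follows. The function names and example numbers are hypothetical. The second function assumes that diagnostic sensitivity, diagnostic specificity and prevalence are given as proportions between 0 and 1.

```python
def predictive_values(tp, fp, tn, fn):
    """Return (positive, negative) predictive values as percentages."""
    ppv = tp / (tp + fp) * 100
    npv = tn / (tn + fn) * 100
    return ppv, npv

def predictive_values_at_prevalence(sensitivity, specificity, prevalence):
    """Predictive values expected for a population with the given prevalence."""
    # Expected fraction of the population falling into each outcome.
    tp = sensitivity * prevalence
    fn = (1 - sensitivity) * prevalence
    tn = specificity * (1 - prevalence)
    fp = (1 - specificity) * (1 - prevalence)
    return predictive_values(tp, fp, tn, fn)

# Hypothetical counts from a validation panel.
ppv, npv = predictive_values(tp=90, fp=10, tn=80, fn=20)

# Same assay performance, two hypothetical prevalence levels.
ppv_low, _ = predictive_values_at_prevalence(0.95, 0.95, prevalence=0.01)
ppv_high, _ = predictive_values_at_prevalence(0.95, 0.95, prevalence=0.50)
print(ppv, npv, ppv_low, ppv_high)
```

With these assumed numbers, the positive predictive value is much lower at 1% prevalence than at 50% prevalence, even though the assay itself is unchanged.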
PERFORMANCE CHARACTERISTIC: An attribute of a test method critical to defining its performance. Characteristics can be analytical sensitivity and specificity, accuracy and precision, diagnostic sensitivity and specificity and/or repeatability and reproducibility.
PRECISION: The degree of dispersion (such as variance, standard deviation or coefficient of variation) within a series of measurements of the same sample tested under specified conditions (ICH Q2, 2005). Low variance between test measurements indicates higher precision.
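The dispersion measures named in this entry can be computed with Python's standard `statistics` module, as in the minimal sketch below. The function name and replicate values are hypothetical, and the sample (n - 1) forms of variance and standard deviation are an assumption about how the replicates would be treated.

```python
import statistics

def precision_summary(measurements):
    """Summarize the dispersion of replicate measurements of one sample."""
    mean = statistics.mean(measurements)
    variance = statistics.variance(measurements)  # sample variance (n - 1)
    sd = statistics.stdev(measurements)           # sample standard deviation
    cv_percent = sd / mean * 100                  # coefficient of variation, in percent
    return {"mean": mean, "variance": variance, "sd": sd, "cv_percent": cv_percent}

# Hypothetical replicate Cq values for a single sample.
summary = precision_summary([24.1, 24.3, 23.9, 24.2, 24.0])
print(summary)
```

Comparing the summary from same-day, same-operator replicates with one from different days or operators is one way to contrast repeatability with intermediate precision.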
PRESUMPTIVE POSITIVE: A sample that has tested positive with a screening method, but requires additional testing to complete the diagnostic determination. For regulatory purposes, presumptive positive samples require additional test results to authorize regulatory actions.
PROFICIENCY TESTING: Measure of laboratory competence using inter-laboratory comparisons against pre-established criteria; implied in this definition is that participating laboratories are using the same test methods, reagents and controls and results are expressed qualitatively (ISO/IEC 17043, 2010).
QUALITATIVE METHOD: Method that detects presence of an analyte based on chemical, biological, or physical properties. Some qualitative methods can be made to be “semi-quantitative” to provide rough estimates of amount present.
QUANTITATIVE METHOD: Method that produces a specific measured quantity of an analyte, such as copies per reaction or colony forming units per milliliter. Quantitative methods must have a well characterized range supported by robust statistical analyses that brackets the action limit.
QUANTITATION LIMIT: Lowest amount of analyte in a sample which can be quantitatively determined with suitable probability of precision and accuracy (ICH Q2, 2005). The quantitation limit is critical for measuring analytes in complex sample matrices, determining pass or fail decisions, and establishing rejection criteria for a test. Quantitative assays are commonly used for measuring analytes, impurities and/or degradation products (excipients).
RANDOM ERROR: Irreproducibility in replicate measurements resulting from random changes in experimental conditions. Random error is the dispersal or variance of measurements collectively evaluated to characterize the precision of an assay. This is in contrast to systematic errors derived from innate error in the system, such as the measurement uncertainty of pipettes, balances, and cycling/thermoblock temperatures (GUM, 2008).
RANGE: The range of an analytical procedure is the interval between the upper and lower concentration (amounts) of analyte in the sample for which it has been demonstrated that the analytical procedure has a suitable level of precision, accuracy and linearity (ICH Q2, 2005).
REFERENCE LABORATORY: Laboratory of recognized scientific and diagnostic expertise for a particular plant disease or testing methodology; includes capability for characterizing and assigning values to reference reagents and samples.
REFERENCE MATERIAL: Reagent validated to define its homogeneity, stability, and reproducibility to standardize or calibrate tests or equipment (VIM, 2007). Reference material should first be sought through an International Reference Laboratory, but can also be obtained from a laboratory that can demonstrate the material has been validated and determined fit for its intended use.
REFERENCE STANDARD: Traceable item with a defined measurement uncertainty such as weights used to calibrate a weigh balance. Reference standards are metrologically traceable to a source such as NIST (VIM, 2007).
REFERENCE STRAIN: a. A verified pure culture of a target organism (preferably housed in an established and recognized culture collection); b. Purified nucleic acids from a verified target organism or verified infected host for positive control in nucleic acid-based assays; c. Purified or expressed proteins from a verified target organism or infected host for positive control in immunoassays; and d. For Forensics, genomic and/or SNP information from a worldwide collection of isolates or strains of the target organism. (b. and c. for when the organism is either unculturable or difficult to maintain in culture).
REPEATABILITY: Level of agreement between replicates of the same sample in the same exact conditions by the same operator, equipment and reagents. For example, the test repeated by analyst A on instrument ABC using reagent lot XYZ on the same day (VIM, 2007).
REPRODUCIBILITY: Ability of a test method to provide consistent results for the same sample tested by the same method in different laboratories (VIM, 2007).
RING TEST: Evaluation of assay performance or reagent integrity by two or more laboratories. One laboratory may act as the reference in defining test sample attributes, or a consensus of performance may be determined.
ROBUSTNESS: Assessment of an assay or material to produce expected results when subjected to testing outside its verified range of use (ICH Q2, 2005). Changes in variables such as temperature, humidity, and stability can be observed to verify whether the assay or material will maintain its validated characteristics when mishandling occurs or if there is a moderate risk that test conditions cannot be adequately controlled.
SCREENING METHOD: Method intended to identify, detect or measure the presence or absence of an organism in a sample as a preliminary assessment. Screening is also referred to as triage, to eliminate negative samples and move presumptive actionable samples forward for confirmation by a certified authority.
SELECTIVITY: The capability to discriminate between the organism of interest and other organisms and components of the sample, such as host tissue. In binary analysis, selectivity is the equivalent of global accuracy taking into account all false reactions, both positive and negative.
SENSITIVITY (ANALYTICAL): Synonymous with “Limit of Detection”; smallest detectable amount of analyte that can be measured with a defined certainty; analyte may include antibodies, antigens, nucleic acids or live organisms.
SENSITIVITY (DIAGNOSTIC): Proportion of known infected reference samples that test positive in the assay; infected plants that test negative are considered to have false-negative results.
SENSITIVITY (RELATIVE): Proportion of reference samples defined as positive by one or a combination of test methods that also test positive in the assay being compared.
SPECIFICITY (GENERAL): A measure of the host range of a pathogen ranging from an extreme specialist for a single species or strain of its host to a generalist with many hosts ranging over several groups of organisms (ISPM 5, 2017).
SPECIFICITY (ANALYTICAL): Analytical specificity refers to the ability of an assay to identify a specific organism or analyte, rather than any other. The higher the analytical specificity, the lower the level of false positives.
SPECIFICITY (DIAGNOSTIC): Proportion of known uninfected reference plants that test negative in the assay; uninfected reference plants that test positive are considered to have false-positive results.
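Diagnostic sensitivity and diagnostic specificity, as defined in this glossary, are simple proportions over reference samples of known status. The Python sketch below uses hypothetical function names and counts.

```python
def diagnostic_sensitivity(true_positives, false_negatives):
    """Proportion of known infected reference samples that test positive."""
    return true_positives / (true_positives + false_negatives)

def diagnostic_specificity(true_negatives, false_positives):
    """Proportion of known uninfected reference samples that test negative."""
    return true_negatives / (true_negatives + false_positives)

# Hypothetical panel: 100 known infected and 100 known uninfected reference plants.
sensitivity = diagnostic_sensitivity(true_positives=95, false_negatives=5)
specificity = diagnostic_specificity(true_negatives=98, false_positives=2)
print(f"sensitivity={sensitivity:.2f}, specificity={specificity:.2f}")
```

Each false negative lowers the first proportion and each false positive lowers the second, which matches the FALSE NEGATIVE and FALSE POSITIVE entries.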
STRINGENCY: Applying rigorous standards of performance and validity, such as: stringent laboratory controls; also, conscientious attention to rules and details.
SYSTEMATIC ERROR: Measurement error innate in an instrument, reagent or testing a specific well-characterized sample matrix. Systematic errors collectively contribute to the minimum dispersion of measurements possible before proceeding to test the sample (GUM, 2008).
THRESHOLD: Measured value distinguishing between negative and positive test results. The threshold may be adjusted to reduce the 5% uncertainty inherent in a system, under the assumption or evidence collected that sample results follow the standard distribution curve as concentration decreases.
TRUENESS: Degree of agreement of the expected value with the true or reference value over a series of evaluations (GUM, 2008).
VALIDATION: Process that determines the fitness for purpose of an assay, which has been properly developed, optimized and standardized, for an intended use.
VERIFICATION: Confirmation by examination of objective evidence that specified requirements have been fulfilled.
Bustin, S. A., Benes, V., Garson, J. A., Hellemans, J., Huggett, J., Kubista, M.,...Wittwer, C. T. (2009). The MIQE Guidelines - Minimum Information for Publication of Quantitative Real-Time PCR Experiments. Clinical Chemistry, 55(4), 611-622.
GUM. (2008). ISO/IEC Guide 98-3:2008 Uncertainty of measurement -- Part 3: Guide to the expression of uncertainty in measurement. International Organization for Standardization.
ICH Q2. (2005). Validation of Analytical Procedures: Text and Methodology. ICH Harmonised Tripartite Guideline. International Conference on Harmonization.
ICH Q5. (2004). Comparability of Biotechnological/Biological Products Subject to Changes in their Manufacturing Process. ICH Harmonised Tripartite Guideline. International Conference on Harmonization.
ISO 16577. (2016). ISO 16577:2016 Molecular biomarker analysis -- Terms and definitions. International Organization for Standardization.
ISO/IEC 17043. (2010). ISO/IEC 17043:2010 Conformity assessment -- General requirements for proficiency testing. International Organization for Standardization.