
Some Consequences of Using the Horsfall-Barratt Scale for Hypothesis Testing

October 2010, Volume 100, Number 10
Pages 1,030-1,041

C. H. Bock, T. R. Gottwald, P. E. Parker, F. Ferrandino, S. Welham, F. van den Bosch, and S. Parnell

First author: United States Department of Agriculture (USDA) Agricultural Research Service (ARS)--SEFTNRL, 21 Dunbar Rd., Byron, GA 31008; second author: USDA ARS--USHRL, 2001 S. Rock Rd., Ft. Pierce, FL 34945; third author: USDA Animal and Plant Health Inspection Service--PPQ, Moore Air Base, Edinburg, TX 78539; fourth author: Department of Plant Pathology and Ecology, Connecticut Agricultural Experiment Station, New Haven 06511; and fifth, sixth, and seventh authors: Rothamsted Research, Harpenden, Herts., AL5 2JQ, England, UK.

Accepted for publication 4 June 2010.

Comparing treatment effects by hypothesis testing is a common practice in plant pathology. Nearest percent estimates (NPEs) of disease severity were compared with Horsfall-Barratt (H-B) scale data to explore whether there was an effect of assessment method on hypothesis testing. A simulation model based on field-collected data using leaves with disease severity of 0 to 60% was used; the relationship between NPEs and actual severity was linear, a hyperbolic function described the relationship between the standard deviation of the rater mean NPE and actual disease, and a lognormal distribution was assumed to describe the frequency of NPEs of specific actual disease severities by raters. Results of the simulation showed standard deviations of mean NPEs were consistently similar to the original rater standard deviation from the field-collected data; however, the standard deviations of the H-B scale data deviated from that of the original rater standard deviation, particularly at 20 to 50% severity, over which H-B scale grade intervals are widest; thus, it is over this range that differences in hypothesis testing are most likely to occur. To explore this, two normally distributed, hypothetical severity populations were compared using a t test with NPEs and H-B midpoint data. NPE data had a higher probability to reject the null hypothesis (H0) when H0 was false but greater sample size increased the probability to reject H0 for both methods, with the H-B scale data requiring up to a 50% greater sample size to attain the same probability to reject the H0 as NPEs when H0 was false. The increase in sample size resolves the increased sample variance caused by inaccurate individual estimates due to H-B scale midpoint scaling. As expected, various population characteristics influenced the probability to reject H0, including the difference between the two severity distribution means, their variability, and the ability of the raters. 
Inaccurate raters showed a similar probability to reject H0 when H0 was false using either assessment method but average and accurate raters had a greater probability to reject H0 when H0 was false using NPEs compared with H-B scale data. Accurate raters had, on average, better resolving power for estimating disease compared with that offered by the H-B scale and, therefore, the resulting sample variability was more representative of the population when sample size was limiting. Thus, there are various circumstances under which H-B scale data has a greater risk of failing to reject H0 when H0 is false (a type II error) compared with NPEs.
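
The comparison described above can be illustrated with a minimal simulation sketch. It is not the authors' model: rater error is drawn here from a normal distribution with a fixed standard deviation (the study assumed a lognormal distribution with a hyperbolic standard deviation function), the H-B interval boundaries are those commonly tabulated for the scale rather than values stated in this abstract, and the critical value of 2.0 is a rough stand-in for the two-sided 5% t threshold. The names `hb_midpoint`, `welch_t`, and `rejection_rate` are hypothetical.

```python
import bisect
import random
import statistics

# Interval boundaries (percent severity) commonly tabulated for the H-B scale.
# Assumption: these values are not given in the abstract itself.
HB_EDGES = [0, 3, 6, 12, 25, 50, 75, 88, 94, 97, 100]


def hb_midpoint(severity):
    """Return the midpoint of the H-B interval containing a percent severity."""
    if severity <= 0:
        return 0.0
    if severity >= 100:
        return 100.0
    i = bisect.bisect_left(HB_EDGES, severity)  # index of first edge >= severity
    return (HB_EDGES[i - 1] + HB_EDGES[i]) / 2


def welch_t(a, b):
    """Two-sample t statistic with unequal variances (Welch)."""
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / se if se > 0 else 0.0


def rejection_rate(mean_a, mean_b, sd, n, rater_sd, use_hb, trials=2000, seed=1):
    """Estimate how often H0 (equal means) is rejected over repeated samples."""
    rng = random.Random(seed)
    crit = 2.0  # assumption: approximate two-sided 5% critical value
    rejections = 0
    for _ in range(trials):
        groups = []
        for mu in (mean_a, mean_b):
            estimates = []
            for _ in range(n):
                # Actual severity of one leaf, clipped to the 0-100% range.
                actual = min(100.0, max(0.0, rng.gauss(mu, sd)))
                # Rater's nearest percent estimate (simplified normal error).
                npe = round(min(100.0, max(0.0, rng.gauss(actual, rater_sd))))
                estimates.append(hb_midpoint(npe) if use_hb else float(npe))
            groups.append(estimates)
        if abs(welch_t(groups[0], groups[1])) > crit:
            rejections += 1
    return rejections / trials


if __name__ == "__main__":
    # Two hypothetical populations inside the wide 25 to 50% H-B interval.
    for label, use_hb in (("NPE", False), ("H-B midpoint", True)):
        rate = rejection_rate(30, 35, sd=5, n=20, rater_sd=2, use_hb=use_hb)
        print(f"{label}: proportion of trials rejecting H0 = {rate:.3f}")
```

Because both runs use the same seed, each method scores the same simulated leaves, so any difference in rejection rate comes from the midpoint conversion alone. Varying `n`, `rater_sd`, and the distance between the two means gives a rough feel for the sample size and rater ability effects discussed in the abstract.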

The American Phytopathological Society, 2010