Sparks, A.H., P.D. Esker, M. Bates, W. Dall' Acqua, Z. Guo, V. Segovia, S.D. Silwal, S. Tolos, and K.A. Garrett, 2008. Ecology and Epidemiology in R: Disease Progress over Time. The Plant Health Instructor. DOI:10.1094/PHI-A-2008-0129-02.

Ecology and Epidemiology in R: Disease Progress over Time

Calculating the area under the disease progress curve to quantify disease progress

The area under the disease progress curve (AUDPC) is a useful quantitative summary of disease intensity over time, for comparison across years, locations, or management tactics. The most commonly used method for estimating the AUDPC, the trapezoidal method, discretizes the time variable (hours, days, weeks, months, or years) and calculates the average disease intensity between each pair of adjacent time points (Madden et al. 2007). We can consider the sample time points in a sequence {ti}, where the time interval between two time points may be consistent or may vary, and we also have associated measures of the disease level {yi}. We define y(0) = y0 as the initial infection or the disease level at t = 0 (i.e., the first disease severity observation in our study). A(tk), the AUDPC at t = tk, is the total accumulated disease until t = tk, given by $A(t_k) = \sum_{i=1}^{k} \frac{y_{i-1} + y_i}{2}\,(t_i - t_{i-1})$. To illustrate this concept, the exercise below shows the rectangular area Si, whose height is the mean of yi-1 and yi, as the increase in accumulated disease during the time interval from ti-1 to ti, with S1 = A1 and Ai = Ai-1 + Si for i ≥ 2.
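As a quick numeric check of the trapezoidal method described above, the following minimal sketch computes each interval's contribution Si and the running totals Ai in vectorized form. The names `y_obs`, `t_obs`, `S`, and `A` are illustrative choices rather than part of the tutorial's script, and the values are the example measurements used in the script further down.

```r
# Example severity (%) and time values; the names are illustrative assumptions
y_obs <- c(1, 2, 7, 7.5)
t_obs <- c(0, 2, 5, 6)

# Mean severity over each interval times the interval length gives S_i
S <- (head(y_obs, -1) + tail(y_obs, -1)) / 2 * diff(t_obs)

# Running totals give A_i = A_(i-1) + S_i
A <- cumsum(S)

print(S)  # 3.00 13.50 7.25
print(A)  # 3.00 16.50 23.75
```

The last element of `A` is the AUDPC for the full observation period.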

For the purposes of this exercise we will write our own AUDPC function. The agricolae package for R also includes an AUDPC function (audpc). Below is an illustration of the function to calculate the AUDPC and graph the output.

#Set up disease severity percent measurements; 
#change these in subsequent analyses to see how it affects
#the AUDPC
ds0<-1
ds1<-2
ds2<-7
ds3<-7.5

#Put these values into a vector without making any changes
disease.severity<-c(ds0,ds1,ds2,ds3)

#Time points at which disease severity measurements are made, 
#change these in subsequent analyses to
#see how it affects the AUDPC Value
t0<-0
t1<-2
t2<-5
t3<-6

#Put time period into a vector
## Do not change these values
time.period<-c(t0,t1,t2,t3)

#Refresh your memory about how the plot function works
help(plot)

#Create the plot of disease severity over time
plot(time.period,
  disease.severity,
  ylim=c(0,(ds3+1)),
  xlim=c(0,(t3+0.5)),
  xlab="Time",
  ylab="Disease Severity (%)",
  type="o",
  pch=19,
  col="mediumblue")

#Add a title and subtitle to our plot
title(main="Illustration of AUDPC Calculation",sub="Figure 1")

#Add text to x labels defining time periods defined in text
mtext("=t0",1,at=0.3,1)
mtext("=t1",1,at=2.3,1)
mtext("=t2",1,at=5.3,1)
mtext("=t3",1,at=6.3,1)

#Illustrate the area under disease progress curve with rectangles.
## Do not change these values
rect(t0,0,t1,((ds0+ds1)/2),border="orange")
# Add text to rectangle to describe rectangle
text(1,1,"A1")
#Repeat for the remaining intervals, adding a segment toward the
#y-axis labeled with the mean severity
rect(t1,0,t2,((ds1+ds2)/2),border="orange")
text(((t1+t2)/2),(((ds1+ds2)/2)/2),"S2")
#Draw line to axis and label with value
segments(.4,((ds1+ds2)/2),t2,((ds1+ds2)/2),col="black",lty="18")
text(0,((ds1+ds2)/2),((ds1+ds2)/2))
rect(t2,0,t3,((ds2+ds3)/2),border="orange")
text(((t2+t3)/2),(((ds2+ds3)/2)/2),"S3")
segments(0.4,((ds2+ds3)/2),t2,((ds2+ds3)/2),col="black",lty="18")
text(0,((ds2+ds3)/2),((ds2+ds3)/2))

#Build a function for AUDPC calculation
#the left curly bracket indicates the beginning of the function
audpc <- function(disease.severity,time.period){

  #n is the length of time.period, or the total number of sample dates
  n <- length(time.period)

  #meanvec is the vector (stored as a one-column matrix)
  #that will contain the mean percent infection
  #it is initialized containing -1 for all entries
  #this sort of initialization is sometimes useful for debugging
  meanvec <- matrix(-1,(n-1))

  #intvec is the vector that will contain the length of time between
  #sampling dates
  intvec <- matrix(-1,(n-1))

  #the loop goes from the first to the penultimate entry
  #the left curly bracket indicates the beginning of commands in the loop
  for(i in 1:(n-1)){

    #the ith entry in meanvec is replaced with the mean percent infection
    #between sample time i and sample time i+1
    meanvec[i] <- mean(c(disease.severity[i],disease.severity[i+1]))

    #the ith entry in intvec is replaced with the length of the time
    #interval between time i and time i+1
    intvec[i] <- time.period[i+1] - time.period[i]

  #the right curly bracket ends the loop
  }

  #the two vectors are multiplied together one entry at a time
  infprod <- meanvec * intvec

  #the sum of the entries in the resulting vector gives the AUDPC
  sum(infprod)

#the right curly bracket ends the function
}


#Now apply the function to the example data and put the result
#in a new object called 'AUDPCexample'
AUDPCexample <- audpc(disease.severity,time.period)
#Display AUDPC Value
#Draw rectangle around value
rect(0.1,(ds3+.3),2,(ds3+1),border="black")
#AUDPC Text
text(1.05,(ds3+0.8),"AUDPC")
text(1.05,(ds3+0.5),AUDPCexample)

Output

AUDPC Illustration


You can use the R code above to illustrate other examples by replacing the values of ds0 through ds3 and t0 through t3, as commented in the R script above. The function for calculating the AUDPC, audpc, can be used with other data sets for which the number of disease observations is equal to the number of time points and for which the time points appear in order.
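The two conditions just mentioned (equal numbers of observations and time points, and time points in order) can be checked before any arithmetic is done. The sketch below is one possible way to do that. `audpc_checked` is a hypothetical name, not part of the tutorial's script, and it uses a vectorized form of the same trapezoidal calculation instead of the loop.

```r
# Hypothetical variant of the tutorial's function with input checks added
audpc_checked <- function(disease.severity, time.period) {
  stopifnot(
    # one severity observation per time point
    length(disease.severity) == length(time.period),
    # at least one interval is needed
    length(time.period) >= 2,
    # time points must appear in non-decreasing order
    !is.unsorted(time.period)
  )
  # mean severity over each pair of adjacent observations
  mean.severity <- (head(disease.severity, -1) + tail(disease.severity, -1)) / 2
  # weight each mean by its interval length and add up
  sum(mean.severity * diff(time.period))
}

# Same example values as in the script above
checked.value <- audpc_checked(c(1, 2, 7, 7.5), c(0, 2, 5, 6))
print(checked.value)  # 23.75
```

Calling `audpc_checked` with vectors of different lengths, or with time points out of order, stops with an error instead of returning a misleading number.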

Since you might have your own disease progress data, this would be a good time to revisit how to use the read.csv function. Refer to An Introduction to the R Programming Environment for a brief tutorial on how to import data from spreadsheets and other sources (Garrett et al. 2007).
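As a minimal sketch of that workflow, the example below writes a small CSV file to a temporary location, reads it back with read.csv, and computes the AUDPC from the imported columns. The column names `day` and `severity`, the temporary file, and the object names are assumptions made for illustration; your own file will have its own path and headers.

```r
# Create a small example file; in practice this would be your own spreadsheet export
csv.path <- tempfile(fileext = ".csv")
writeLines(c("day,severity", "0,1", "2,2", "5,7", "6,7.5"), csv.path)

# Import the data; the first row is treated as column names by default
progress <- read.csv(csv.path)

# Make sure the observations are in time order before computing the AUDPC
progress <- progress[order(progress$day), ]

# Trapezoidal AUDPC: mean severity per interval times interval length, summed
audpc.from.file <- sum(
  (head(progress$severity, -1) + tail(progress$severity, -1)) / 2 *
    diff(progress$day)
)
print(audpc.from.file)  # 23.75
```

With the tutorial's audpc function already defined in the session, the same result should come from audpc(progress$severity, progress$day).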