There are several operational definitions of "inter-rater reliability", reflecting different viewpoints about what constitutes a reliable agreement between raters. A number of statistics can be used to determine inter-rater reliability, and different statistics are appropriate for different types of measurement. Some options are the joint probability of agreement, Cohen's kappa, Scott's pi and the related Fleiss' kappa, inter-rater correlation, the concordance correlation coefficient, the intraclass correlation, and Krippendorff's alpha.

Krippendorff's alpha is a versatile statistic that assesses the agreement among observers who categorize, evaluate, or measure a given set of objects in terms of the values of a variable. It generalizes several specialized agreement coefficients: it accepts any number of observers, is applicable to nominal, ordinal, interval, and ratio levels of measurement, can handle missing data, and can be corrected for small sample sizes.

Another approach to agreement (useful when there are only two raters and the scale is continuous) is to calculate the differences between each pair of the two raters' observations. The mean of these differences is termed bias, and the reference interval (mean ± 1.96 × standard deviation of the differences) is termed the limits of agreement. The limits of agreement provide insight into how much random variation may be influencing the ratings. If the raters tend to disagree, but without a consistent pattern of one rating being higher than the other, the mean of the differences will be close to zero. Confidence limits (usually 95%) can be calculated for both the bias and each of the limits of agreement. Several formulae can be used to calculate the limits of agreement; the simple formula given above (mean ± 1.96 × standard deviation) works well for sample sizes greater than 60.
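To make the two-rater procedure concrete, here is a minimal Python sketch of the bias and limits-of-agreement calculation described above, using NumPy and SciPy. The function name `limits_of_agreement` and the sample ratings are hypothetical, chosen for illustration only; this is a sketch of the mean ± 1.96 × SD formula, not a definitive implementation.

```python
import numpy as np
from scipy import stats

def limits_of_agreement(rater_a, rater_b, ci=0.95):
    """Bias and limits of agreement for two raters on a continuous scale.

    rater_a, rater_b: equal-length sequences of the two raters'
    observations of the same set of objects.
    """
    d = np.asarray(rater_a, dtype=float) - np.asarray(rater_b, dtype=float)
    n = d.size
    bias = d.mean()                       # mean difference ("bias")
    sd = d.std(ddof=1)                    # sample SD of the differences
    z = stats.norm.ppf(0.5 + ci / 2.0)    # ~1.96 for a 95% interval
    loa = (bias - z * sd, bias + z * sd)  # limits of agreement
    # t-based confidence limits for the bias itself
    t = stats.t.ppf(0.5 + ci / 2.0, df=n - 1)
    bias_ci = (bias - t * sd / np.sqrt(n), bias + t * sd / np.sqrt(n))
    return {"bias": bias, "loa": loa, "bias_ci": bias_ci}

# Hypothetical example: two raters scoring the same five objects.
a = [5.1, 4.8, 6.0, 5.5, 4.9]
b = [5.0, 5.2, 5.8, 5.6, 4.7]
print(limits_of_agreement(a, b))
```

Confidence limits for each limit of agreement can be computed analogously from its standard error. For the categorical statistics listed above, ready-made implementations exist; scikit-learn's `cohen_kappa_score`, for instance, computes Cohen's kappa for two raters.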
. . .