Validity has largely been tested in the literature through convergent validity, that is, a correlation between an instrument and a second ‘validated’ instrument. Such a correlation is necessary but not sufficient for a satisfactory instrument (even if the second instrument is truly valid): an instrument with poor construct validity may still correlate highly with another (criterion) instrument. Two cases are illustrated below.

Figure 2 Insensitivity/Content Invalidity



Insensitivity: In the first case, insensitivity (strictly a subset of content invalidity) may occur when an instrument has too few items or too few response categories to fully capture the effect of a dimension. In Figure 2, true utility (True U) on the vertical axis is recorded by instrument Z on the horizontal axis as 0.0 until it reaches a ‘switch point’, ‘a’, at which the average recorded response becomes ‘a1’. True utility must then rise to ‘b’ before recorded utility switches to ‘b1’, and so on. As a result, a program which increases true utility by an amount ‘A’ that falls between two switch points will record no change on instrument Z. Conversely, a program increasing true utility by a smaller amount, ‘B’, that happens to cross a switch point will produce a large recorded increase in utility, from ‘b1’ to ‘c1’. Importantly, a validation study measuring both true and recorded utility (points along the step function) would nevertheless produce a high correlation.
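The step-function argument can be checked numerically. The following Python sketch is illustrative only: the switch points (0.25, 0.50, 0.75), the recorded values, and the sizes of programs A and B are hypothetical numbers chosen for the example, not values read from Figure 2.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical switch points on the true-utility scale, and the values
# that instrument Z records below, between, and above them.
SWITCH_POINTS = [0.25, 0.50, 0.75]
RECORDED = [0.0, 0.3, 0.6, 0.9]

def record(true_u):
    """Step function: recorded utility changes only at a switch point."""
    level = sum(true_u >= s for s in SWITCH_POINTS)
    return RECORDED[level]

# A validation study sampling points along the step function.
true_u = [i / 100 for i in range(101)]
recorded = [record(u) for u in true_u]
r = pearson(true_u, recorded)

# Program A: a large true gain that stays between two switch points.
gain_a_true = 0.49 - 0.26
gain_a_recorded = record(0.49) - record(0.26)

# Program B: a small true gain that happens to cross a switch point.
gain_b_true = 0.51 - 0.49
gain_b_recorded = record(0.51) - record(0.49)

print(f"correlation = {r:.3f}")
print(f"A: true gain {gain_a_true:.2f}, recorded gain {gain_a_recorded:.2f}")
print(f"B: true gain {gain_b_true:.2f}, recorded gain {gain_b_recorded:.2f}")
```

Under these assumed values the correlation between true and recorded utility exceeds 0.95, yet the larger true gain (program A) records no change while the much smaller true gain (program B) records a full step.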


Figure 3 Construct validity


Content invalidity: Figure 3 again depicts true utility on the vertical axis and measured utility on the horizontal axis. The points shown would yield a high positive correlation between the two. However, the omission of items or dimensions from instrument Z may result in a cluster of points, A, where the dimensions included in instrument Z improve but the effect is more than offset by a negative effect on the omitted items or dimensions, so that measured utility rises while true utility falls.
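A comparable sketch illustrates the content invalidity case. It assumes, purely for illustration, a validation sample in which measured utility tracks true utility exactly, plus a hypothetical cluster of ten treated patients whose measured dimensions improve by 0.15 while an omitted dimension deteriorates by 0.25; none of these numbers comes from Figure 3.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# General validation sample as (measured, true) pairs: here the instrument
# tracks true utility exactly (an idealised assumption).
baseline = [(i / 100, i / 100) for i in range(101)]

# Hypothetical cluster A: ten treated patients starting in the mid-range.
MEASURED_GAIN = 0.15  # improvement on dimensions that instrument Z includes
OMITTED_LOSS = 0.25   # deterioration on a dimension that instrument Z omits
before = [(0.40 + 0.01 * j, 0.40 + 0.01 * j) for j in range(10)]
after = [(m + MEASURED_GAIN, t + MEASURED_GAIN - OMITTED_LOSS)
         for m, t in before]

# Pooled correlation across the general sample and cluster A.
sample = baseline + after
r = pearson([m for m, _ in sample], [t for _, t in sample])

# Change in measured and true utility for each treated patient.
measured_changes = [a[0] - b[0] for a, b in zip(after, before)]
true_changes = [a[1] - b[1] for a, b in zip(after, before)]

print(f"pooled correlation = {r:.3f}")
print(f"measured change = {measured_changes[0]:+.2f}")
print(f"true change     = {true_changes[0]:+.2f}")
```

With these assumed numbers the pooled correlation stays above 0.9, so the instrument would pass a convergent validity test, while every patient in cluster A shows a measured gain alongside a true loss.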

To illustrate these figures, consider a pain instrument consisting of three statements.

(i)    I have no pain

(ii)    I have mild pain

(iii)    I have extreme pain

This instrument would produce a significant correlation with a validated pain instrument, as each of the three responses would attract the patients closest to it. Despite the high correlation, the instrument clearly lacks sensitivity to gradations and types of pain (content invalidity).

The instrument could be ‘validated’ against a more detailed instrument such as the McGill Pain Questionnaire; that is, it would correlate highly with it. Suppose, however, that this instrument was used to measure the QoL of two people, one of whom was receiving pain-reducing medication with side effects (e.g. loss of vitality) that are omitted from the overall instrument. As in Figure 3, a measured increase in utility in the mid-range attributable to the medication might correspond to a decrease in true utility due to the loss of vitality. The ‘validated’ instrument would therefore produce results suggesting the opposite of the true outcome, as a result of its content invalidity in the context of this evaluation.