Become an Expert in Spirometry

Statistical features of laboratory tests

Who is 'sick'? The GOLD group regards an FEV1%FVC < 70% as indicative of airway obstruction, irrespective of respiratory symptoms. Hence let us regard anyone with an FEV1%FVC < 70% as sick. One can then compare this index to the lower limit of normal (LLN) according to various authors and identify subjects who are 'sick' but whose FEV1%FVC is above the LLN (false negative) or below the LLN (true positive). Similarly, an individual who is 'not sick' according to the GOLD criterion is classified as false positive if FEV1%FVC is below the LLN, and as true negative if it is above the LLN.

Based on these classifications one can then derive the sensitivity, specificity and the predictive value of a positive or negative test result. We have made software, available for free, that allows you to apply this analysis to a number of populations of men and women.

The efficacy of two tests designed to separate healthy from sick individuals follows from a comparison of how each test classifies subjects, as follows:

                      GOLD criterion
LLN criterion    Sick     Not sick   Total
Sick             TP       FN         TP+FN
Not sick         FP       TN         FP+TN
Total            TP+FP    FN+TN      TP+FP+FN+TN

where  TP = true positive
       FP = false positive
       FN = false negative
       TN = true negative
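
The indices mentioned above follow directly from the four cells of the table. The snippet below is a minimal sketch of that arithmetic, not the code used in our software; the function name `diagnostic_indices` and the counts are made up for illustration, and it assumes that none of the row or column totals is zero.

```python
def diagnostic_indices(tp, fp, fn, tn):
    """Return sensitivity, specificity and predictive values from 2x2 counts."""
    return {
        # positives detected among all subjects the reference calls sick
        "sensitivity": tp / (tp + fn),
        # negatives detected among all subjects the reference calls not sick
        "specificity": tn / (tn + fp),
        # fraction of positive test results that are true positives
        "ppv": tp / (tp + fp),
        # fraction of negative test results that are true negatives
        "npv": tn / (tn + fn),
    }


# Made-up counts, for illustration only.
indices = diagnostic_indices(tp=40, fp=10, fn=5, tn=145)
for name, value in indices.items():
    print(f"{name}: {value:.3f}")
```

With real data, the counts would come from cross-tabulating the two criteria as in the table.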

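In the table, and in the list below, the roles are reversed relative to the introduction: the LLN criterion serves as the reference against which the GOLD criterion (FEV1%FVC < 70%) is judged. This boils down to: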

False positive: FEV1%FVC < 70% but FEV1%FVC >= LLN
False negative: FEV1%FVC >= 70% but FEV1%FVC < LLN
True positive: FEV1%FVC < 70% and FEV1%FVC < LLN
True negative: FEV1%FVC >= 70% and FEV1%FVC >= LLN
Prevalence GOLD: subjects with FEV1%FVC < 70%, as a % of the total population
Prevalence LLN: subjects with FEV1%FVC < LLN, as a % of the total population
Efficiency: true negatives + true positives, as a % of the total population
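
The definitions above can be sketched in a few lines of code. This is an illustration under stated assumptions rather than the implementation in our software: the names `classify` and `summarise` are hypothetical, each subject is represented by a pair (measured FEV1%FVC, predicted LLN) in percent, and the four sample subjects are invented.

```python
from collections import Counter

GOLD_CUTOFF = 70.0  # fixed GOLD threshold for FEV1%FVC, in percent


def classify(fev1_fvc, lln):
    """Label one subject, with the LLN as reference and GOLD as the test."""
    gold_sick = fev1_fvc < GOLD_CUTOFF
    lln_sick = fev1_fvc < lln
    if gold_sick and lln_sick:
        return "TP"
    if gold_sick:
        return "FP"  # below 70% but not below the LLN
    if lln_sick:
        return "FN"  # at or above 70% but below the LLN
    return "TN"


def summarise(subjects):
    """subjects: iterable of (FEV1%FVC, LLN) pairs, both in percent."""
    counts = Counter(classify(ratio, lln) for ratio, lln in subjects)
    total = sum(counts.values())
    return {
        "counts": dict(counts),
        "prevalence_gold": 100 * (counts["TP"] + counts["FP"]) / total,
        "prevalence_lln": 100 * (counts["TP"] + counts["FN"]) / total,
        "efficiency": 100 * (counts["TP"] + counts["TN"]) / total,
    }


# Hypothetical subjects: (measured FEV1%FVC, predicted LLN)
sample = [(65.0, 72.0), (68.0, 64.0), (71.0, 73.0), (80.0, 70.0)]
summary = summarise(sample)
print(summary)
```

In practice the LLN of each subject would come from a prediction equation that depends on age, height and sex; here it is simply supplied as a number.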

Before we continue, one should realize that there is one big catch in comparing the GOLD criterion to any other criterion. COPD is defined in terms of clinical symptoms and the natural history of the disease, and in terms of pathological features. However, the GOLD criterion is not based on any of these: according to the GOLD guidelines COPD is a test finding, rather than the test finding being a means of corroborating or discarding the diagnosis of COPD. Although it is very likely that the GOLD guideline will lead to identifying a significant proportion of people with COPD, it is equally likely to incorrectly identify subjects as having COPD (false positives), as well as to miss people who do have COPD (false negatives) [5-10]. This is true of any test, so that the above also applies if one uses the LLN criterion.

Since it is generally agreed that COPD has a high prevalence in the general population, we might expect reasonably high predictive values for both negative and positive test results. However, keep in mind that these findings are likely to be much influenced by the age range of the population studied. Based on a long list of publications on the predicted lower limit of normal for FEV1%FVC, we may expect the proportion of false positive test results with the GOLD criterion to be higher the older the population studied.

Briefly, therefore, the validation of a test requires knowledge of the true diagnosis in all cases, which is clearly not available in the case of COPD. In such circumstances the results of a new test are commonly compared to those of a well-established test with its own specificity and sensitivity. However, any agreement between the two tests should ideally now be described in terms of co-positivity (the probability that the test is correctly interpreted as positive given that the reference diagnosis is positive) and co-negativity (the probability that the test is correctly interpreted as negative given that the reference diagnosis is negative) [4].
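
As a rough illustration of these two indices, the sketch below computes co-positivity and co-negativity from paired test results. The function name `agreement_indices` and the six subjects are hypothetical, and taking the LLN result as the reference is an assumption made only for this example. The sketch also assumes that the reference calls at least one subject positive and at least one negative.

```python
def agreement_indices(test_positive, reference_positive):
    """Co-positivity and co-negativity of a test relative to a reference test.

    Both arguments are equal-length sequences of booleans, one per subject.
    """
    pairs = list(zip(test_positive, reference_positive))
    test_given_ref_pos = [t for t, r in pairs if r]
    test_given_ref_neg = [t for t, r in pairs if not r]
    # P(test positive | reference positive)
    co_positivity = sum(test_given_ref_pos) / len(test_given_ref_pos)
    # P(test negative | reference negative)
    co_negativity = sum(not t for t in test_given_ref_neg) / len(test_given_ref_neg)
    return co_positivity, co_negativity


# Hypothetical results for six subjects.
gold = [True, True, False, True, False, False]  # test: FEV1%FVC < 70%
lln = [True, False, False, True, True, False]   # reference: FEV1%FVC < LLN
co_pos, co_neg = agreement_indices(gold, lln)
print(f"co-positivity: {co_pos:.2f}, co-negativity: {co_neg:.2f}")
```

The arithmetic is the same as for sensitivity and specificity; the different names signal that the reference is itself a test rather than the true diagnosis.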

Please note, therefore, that these indices only describe the agreement between two test results: defining a subject as having airway obstruction either on the basis of the LLN or on the basis of the GOLD criterion. This leaves unanswered the question of who is really sick and who is not.

The figure below shows the findings in one of the general population samples, as produced by our software (available for free). As expected, with the GOLD fixed cut-off, false negative findings occur in the youngest age group, and false positive ones at an older age. Please note that in the 'NL Healthy', 'AUS Healthy' and 'UK Healthy' populations, the FEV1%FVC is below the predicted LLN (prevalence LLN) in approximately 5% of healthy subjects.
Display of observations and statistics
The software displays, as an insert in a larger graph, a so-called ROC (Receiver Operating Characteristic) curve. This is a graph which displays the relationship between the true positive fraction (TPF, the 'hits') and the false positive fraction (FPF = 1 - specificity, the 'false alarms') of the population as one varies the LLN from a very low to a very high value. If we denote the fraction of false negatives (the 'misses') as FNF, and of true negatives as TNF, it follows that FNF + TPF = 1, and FPF + TNF = 1. Therefore all basic information about the efficacy of a test is contained in the ROC curve.
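
How such a curve arises can be sketched as follows. This is not the algorithm of our software but an illustration under explicit assumptions: the LLN is varied by adding the same shift to every subject's predicted LLN, the fixed GOLD cut-off serves as the reference, and the function name `roc_points` and the sample data are invented. The sample must contain subjects on both sides of the GOLD cut-off.

```python
def roc_points(subjects, shifts, gold_cutoff=70.0):
    """Return (FPF, TPF) pairs, one per shift applied to every subject's LLN.

    subjects: list of (FEV1%FVC, LLN) pairs in percent.
    Reference: the fixed GOLD cut-off. Test: FEV1%FVC below the shifted LLN.
    """
    reference = [ratio < gold_cutoff for ratio, _ in subjects]
    n_pos = sum(reference)
    n_neg = len(reference) - n_pos
    points = []
    for shift in shifts:
        flagged = [ratio < lln + shift for ratio, lln in subjects]
        tp = sum(f and r for f, r in zip(flagged, reference))
        fp = sum(f and not r for f, r in zip(flagged, reference))
        points.append((fp / n_neg, tp / n_pos))
    return points


# Hypothetical subjects: (measured FEV1%FVC, predicted LLN)
sample = [(62.0, 70.0), (66.0, 63.0), (69.0, 71.0),
          (72.0, 74.0), (78.0, 69.0), (85.0, 72.0)]
curve = roc_points(sample, shifts=[-30, -5, 0, 5, 30])
for fpf, tpf in curve:
    print(f"FPF = {fpf:.2f}, TPF = {tpf:.2f}")
```

A very low LLN flags nobody (FPF = 0, TPF = 0) and a very high one flags everybody (FPF = 1, TPF = 1); the points in between trace the curve.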

You can tweak the LLN of FEV1%FVC by moving the pointer in the trackbar (or click on the trackbar and then move the pointer by using the arrow keys). We’ll see later why that may be useful.

References
1 Borsboom GJJM, van Pelt W, van Houwelingen HC, van Vianen BG, Schouten JP, Quanjer PH. Diurnal variation in lung function in subgroups from two Dutch populations. Consequences for longitudinal analysis. Am J Respir Crit Care Med 1999; 159: 1163-1171.
2 Gore CJ, Crockett AJ, Pederson DG, Booth ML, Bauman A, Owen N. Spirometric standards for healthy adult lifetime nonsmokers in Australia. Eur Respir J 1995; 8: 773-782.
3 Falaschetti E, Laiho J, Primatesta P, Purdon S. Prediction equations for normal and low lung function from the Health Survey for England. Eur Respir J 2004; 23: 456-463.
4 Buck AA, Gart JJ. Comparison of a screening test and a reference test in epidemiologic studies. I. Indices of agreement and their relation to prevalence. Am J Epid 1966; 83: 586-592.
5 Hardie JA, Buist AS, Vollmer WM, Ellingsen I, Bakke PS, Mørkve O. Risk of over-diagnosis of COPD in asymptomatic elderly never smokers. Eur Respir J 2002; 20: 1117-1122.
6 Hnizdo E, Glindmeyer HW, Petsonk EL, Enright P, Buist AS. Case definitions for chronic obstructive pulmonary disease. COPD: Journal of Chronic Obstructive Pulmonary Disease 2006; 3: 1-6.
7 Celli BR, Halbert RJ, Isonaka S, Schau B. Population impact of different definitions of airway obstruction. Eur Respir J 2003; 22: 268-273.
8 Aggarwal AN, Gupta D, Behera D, Jindal SK. Comparison of fixed percentage method and lower confidence limits for defining limits of normality for interpretation of spirometry. Respir Care 2006; 51: 737-743.
9 Culver BH. We can do better than the GOLD standard. Respir Care 2006; 51: 719-721.
10 Roberts SD, Farber MO, Knox KS, Phillips GS, Bhatt NY, Mastronarde JG, Wood KL. FEV1/FVC ratio of 70% misclassifies patients with obstruction at the extremes of age. Chest 2006; 130: 200-206.

See also: Receiver Operating Characteristic (ROC) curve

© Philip H. Quanjer