## Comparing the assessment of diagnostic tests in epidemiology and clinical practice

### By HUW LLEWELYN, May 28 2021 10:21PM

Tests are usually assessed by estimating their sensitivities and specificities. These epidemiological indices give an indication of how well a single test will perform during population screening. The discriminating power of a combination of test results is also estimated with these epidemiological indices by assuming statistical independence between the likelihoods of the individual findings in the combination occurring in the presence and in the absence of the single diagnosis. (This assumption of statistical independence is usually false and leads to over-estimation of discriminating power.)
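As a minimal sketch of the independence calculation described above, the positive likelihood ratio of each finding can be computed from its sensitivity and specificity and the ratios multiplied together. The function names and the sensitivity and specificity values are hypothetical and chosen only for illustration.

```python
from math import prod


def likelihood_ratio_positive(sensitivity: float, specificity: float) -> float:
    """LR+ = P(finding | diagnosis present) / P(finding | diagnosis absent)."""
    return sensitivity / (1.0 - specificity)


def combined_likelihood_ratio(tests: list[tuple[float, float]]) -> float:
    """Multiply individual LR+ values.

    This product is only valid if the findings are statistically independent
    both in the presence and in the absence of the diagnosis.
    """
    return prod(likelihood_ratio_positive(se, sp) for se, sp in tests)


# Hypothetical (sensitivity, specificity) pairs for two positive findings.
tests = [(0.8, 0.9), (0.6, 0.8)]
combined = combined_likelihood_ratio(tests)
print(f"Combined LR+ under independence: {combined:.1f}")
```

With these illustrative numbers the individual ratios are about 8 and 3, giving a combined ratio of about 24. If the two findings tend to occur together regardless of the diagnosis, the true ratio would be smaller, which is the over-estimation noted above.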

The resulting likelihood ratio based on statistical independence is applied to an individual by combining it with the subjective prior probability of a single diagnosis for that individual. Despite a tendency to overestimate probabilities, this Bayesian approach is regarded by many as the standard way in which diagnostic test results should be used to arrive at diagnostic probabilities. However, this tendency to overestimate probabilities might be reduced by considering more than one diagnostic possibility and then normalising the probabilities by ensuring that they add up to one, thus reducing the risk of confirmation bias and over-diagnosis.
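The step of combining a likelihood ratio with a prior probability can be sketched with the odds form of Bayes rule. The function name, the prior of 0.1 and the likelihood ratio of 24 are assumptions made for illustration, not values from any study.

```python
def posterior_probability(prior: float, likelihood_ratio: float) -> float:
    """Convert a prior probability to a posterior using the odds form of Bayes rule."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical subjective prior of 0.1 and a combined likelihood ratio of 24.
posterior = posterior_probability(0.1, 24.0)
print(f"Posterior probability: {posterior:.3f}")
```

Because the posterior scales directly with the likelihood ratio in odds terms, any over-estimate of the ratio from a false independence assumption carries through to the diagnostic probability.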

The differential diagnostic process

During clinical practice, a patient’s presenting complaint (or screening test result) is usually interpreted by considering a list of possible diagnoses, each with an estimated probability based on past experience. Another finding is then sought that occurs commonly in one or more of these possibilities but less commonly in one or more of the others. When the result of such an investigation becomes known, the probabilities of each of the diagnoses in the original list are updated. Alternatively, if a new test result has a shorter list of possible causes (another measure of a ‘good’ test), that list could be considered instead.

In this differential diagnostic setting, the calculations are based on the likelihood ratios between pairs of diagnoses (analogous to ‘Bayes factors’ when testing stochastic hypotheses). This reasoning process with multiple diagnoses is based on a derivation of the extended form of Bayes rule and a dependence assumption (see Chapter 13 of the Oxford Handbook of Clinical Diagnosis - accessed via this link). As more evidence becomes available and some diagnostic possibilities become improbable, only a few (or one) in the original list may remain probable. The diagnostician will then try to confirm one of these probable diagnoses by demonstrating the presence of one of its 'sufficient' diagnostic criteria.
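The updating of a differential diagnosis list can be sketched as follows. This is not the derivation in the handbook chapter cited above, which is not reproduced here; it is the standard extended Bayes rule applied to a single new finding, assuming the listed diagnoses are mutually exclusive and cover all the possibilities. The diagnosis labels, priors and likelihoods are hypothetical.

```python
def update_differential(
    priors: dict[str, float], likelihoods: dict[str, float]
) -> dict[str, float]:
    """Update each diagnosis in the list for one new finding.

    Each prior is multiplied by the likelihood of the finding given that
    diagnosis, and the products are normalised so that they add up to one.
    """
    unnormalised = {dx: priors[dx] * likelihoods[dx] for dx in priors}
    total = sum(unnormalised.values())
    return {dx: value / total for dx, value in unnormalised.items()}


# Hypothetical differential diagnosis list with prior probabilities.
priors = {"diagnosis A": 0.5, "diagnosis B": 0.3, "diagnosis C": 0.2}
# Hypothetical P(new finding | diagnosis): common in A, less common in B and C.
likelihoods = {"diagnosis A": 0.8, "diagnosis B": 0.2, "diagnosis C": 0.1}

posteriors = update_differential(priors, likelihoods)
for dx, p in posteriors.items():
    print(f"{dx}: {p:.3f}")
```

The posterior odds between any pair of diagnoses equal their prior odds multiplied by the ratio of the two likelihoods, so only the ratios between pairs matter. Here the ratio of 0.8 to 0.2 between A and B raises their odds from 5:3 to 20:3.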

Diagnostic criteria

A diagnostic criterion may be ‘sufficient’, ‘necessary’ or both, the latter being ‘definitive’. A sufficient criterion (e.g. a positive PCR) by convention implies that the diagnosis is confirmed. A necessary criterion implies that if its finding is absent then the diagnosis cannot be confirmed. In practice there will be many sufficient criteria that provide a choice of how the diagnosis is confirmed. Rarely, if ever, can a single test’s result or even a combination of results be definitive (i.e. both sufficient and necessary). However, a necessary criterion can be constructed in a circular way from all the recognised sufficient criteria, so that the absence of all the sufficient criteria excludes the diagnosis. Although use of the diagnosis is excluded, this does not exclude the possibility that the underlying disease is present.
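The logic of sufficient criteria and the constructed necessary criterion can be sketched as simple set operations. This is a simplification that treats each criterion as a named finding; the function names and all criterion names other than the positive PCR mentioned above are hypothetical.

```python
def diagnosis_confirmed(findings: set[str], sufficient_criteria: set[str]) -> bool:
    """The diagnosis is confirmed if at least one sufficient criterion is present."""
    return bool(findings & sufficient_criteria)


def diagnosis_excluded(findings: set[str], sufficient_criteria: set[str]) -> bool:
    """The constructed necessary criterion is 'at least one sufficient criterion'.

    If none is present, the diagnostic label is excluded by convention,
    although the underlying disease may still be present.
    """
    return not diagnosis_confirmed(findings, sufficient_criteria)


# Hypothetical sufficient criteria for one diagnosis.
sufficient = {"positive PCR", "hypothetical criterion X"}

print(diagnosis_confirmed({"cough", "positive PCR"}, sufficient))
print(diagnosis_excluded({"cough", "fever"}, sufficient))
```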

The relationship between RCTs and diagnostic criteria

The purpose of a diagnosis is to suggest actions to help the patient (such as giving advice about what is going to happen with or without intervention). If an intervention has to be justified with the result of a randomised controlled trial (RCT), then the entry criteria for that trial, generated from biomedical hypotheses, have to be present for the intervention to be offered. To ensure that all those with the diagnosis will be offered the treatment, the entry criterion for the RCT could be used as one of the sufficient criteria for the diagnosis. Alternatively, it should be ensured that those showing the entry criterion for the RCT are a subset of those with a sufficient criterion for the diagnosis. In some cases, the entry criteria for an RCT may exclude patients in danger of mild adverse effects (e.g. the elderly with co-morbidities). If the danger from an illness exceeded that from adverse effects (e.g. during decision analysis), then the trial exclusion factor would not be applicable. The sufficient criteria for a diagnosis would therefore need to be widened so as not to exclude those who might benefit from its suggested treatments.

Comparing different tests for use as diagnostic criteria

In order to assess the usefulness of tests as diagnostic criteria, the effect on trial outcomes of using different tests or different test result ranges could be compared. This could mean having to repeat RCTs when the efficacy of a treatment has already been established. However, an alternative approach could be used by initially randomising subjects to different tests instead of to treatment and control (see the preprint accessed by this link).