It is vital that a study has an adequate sample size. This ensures that the study has a good chance of detecting a statistically significant result if a true effect exists, and that resources are allocated appropriately. A study with an inadequate sample size has a low probability of detecting a true effect (that is, low statistical power) and therefore represents a waste of valuable resources. Such studies add little to scientific knowledge.

It is essential that the term "statistically significant" is not confused with clinical significance. To say that two groups are statistically significantly different from each other means that the observed difference is unlikely to have arisen by chance alone, so there is likely to be a genuine difference between the groups. However, statistical significance says nothing about the size of the difference. For example, the difference may be statistically significant but so small that it is of no clinical importance. To use resources efficiently, the number of subjects should be sufficient to detect an effect of clinical importance, yet not so large that effects too small to be of interest are detected.
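
The point that a very large study can make a trivially small difference statistically significant can be illustrated with a short calculation. This is a minimal sketch using a simple two-sample z-test with a known standard deviation; the function name and all figures (a difference of 0.05 units, a standard deviation of 1.0 units, and the two group sizes) are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p_value(difference, sd, n_per_group):
    """Two-sided P value for a difference in means between two equal-sized
    groups, using a z-test that treats the standard deviation as known."""
    standard_error = sd * sqrt(2 / n_per_group)
    z = abs(difference) / standard_error
    return 2 * (1 - NormalDist().cdf(z))


# Hypothetical figures: the same small difference observed in a small
# study and in a very large study.
p_small_study = two_sided_p_value(0.05, 1.0, n_per_group=100)
p_large_study = two_sided_p_value(0.05, 1.0, n_per_group=10_000)

print(f"100 subjects per group:    P = {p_small_study:.3f}")
print(f"10000 subjects per group:  P = {p_large_study:.4f}")
```

With these assumed figures the same difference is far from significant in the small study but well below P = 0.05 in the large one, although its size, and hence its clinical importance, is unchanged.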

If you wish to estimate the average daily intake of fibre in the population, how many subjects would be needed? To calculate this you need an estimate of how much fibre intake varies in the population, which can be judged from the likely range of intakes. Using this, you can calculate the number of subjects you need to survey so that there will be 95% confidence that the true population mean is within, say, 5 g of the sample mean intake.
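
The calculation described above can be sketched as follows, assuming the usual normal-approximation formula n = (z × SD / margin)², which may differ in detail from the method in any particular textbook. The likely range of 5 to 45 g per day is an invented figure, and converting a range to a standard deviation by dividing by four is only a rough rule of thumb; the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_for_mean(sd, margin, confidence=0.95):
    """Subjects needed so that the sample mean lies within `margin` of the
    population mean with the given confidence (normal approximation)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return ceil((z * sd / margin) ** 2)


# Assumed likely range of daily fibre intake (g); not taken from real data.
likely_low, likely_high = 5.0, 45.0
# Rough rule of thumb: the range covers about four standard deviations.
sd_guess = (likely_high - likely_low) / 4

n_subjects = n_for_mean(sd_guess, margin=5.0)
print(n_subjects)  # 16 with these assumed inputs
```

Because the required number rises with the square of the standard deviation, doubling the assumed variability roughly quadruples the sample size, so the estimate of variability deserves care.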

The number of subjects needed for a particular study may be greater than you might expect. For example, how many subjects would be needed for a randomised controlled trial of the effect of dietary advice on plasma cholesterol? If the intervention reduces plasma cholesterol by 10%, you would need a total of 170 subjects (85 intervention and 85 controls) to give a 90% chance of obtaining a statistically significant result (P < 0.05). To calculate the number of subjects for this study you need to know the population mean and standard deviation for plasma cholesterol and the size of the difference you wish to detect as statistically significant. The formulae used to calculate sample size are widely available in statistics textbooks. A simplified method, in the form of a nomogram, has been published in the British Medical Journal (1980;281:1336-1338). Alternatively, you should consult a statistician.
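
As an illustration of the kind of textbook formula referred to above, the following sketch uses the standard normal-approximation formula for comparing two means, n per group = 2 × ((z_alpha + z_beta) × SD / difference)². The population mean of 6.0 mmol/L and standard deviation of 1.2 mmol/L are assumptions made for this example, not values given in the text, and the function name is hypothetical. The published nomogram may use a slightly different method.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_means(sd, difference, alpha=0.05, power=0.90):
    """Subjects per group needed to detect `difference` between two means
    with a two-sided test at level `alpha` (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / difference) ** 2)


# Assumed population values for plasma cholesterol (mmol/L).
assumed_mean = 6.0
assumed_sd = 1.2
# A 10% reduction in the mean is the difference to be detected.
difference = 0.10 * assumed_mean

n_each = n_per_group_means(assumed_sd, difference)
print(n_each, 2 * n_each)  # 85 per group, 170 in total, with these inputs
```

Under these assumed inputs the difference to be detected is half a standard deviation, which reproduces the figure of 85 subjects per group. A different mean or standard deviation would give a different answer.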

Another example is a randomised controlled trial among men who have recently had a heart attack. How many subjects would be needed to have a high probability of detecting an effect of the intervention on subsequent mortality? To calculate this you need to estimate the reduction in mortality that could be expected as a result of the intervention. This can only be an approximation, as the true effect cannot be known at the study design stage. However, previous research studies and mortality statistics can provide a guideline. If the expected reduction in mortality is, say, 30% over a two-year follow-up period, a total of 1600 men would be needed to give a 90% chance of obtaining a statistically significant result. If the true effect of the intervention is to reduce mortality by only 20%, the sample size would need to be 4000 men.
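
The way the required sample size grows as the expected effect shrinks can be sketched with the standard normal-approximation formula for comparing two proportions. The text does not state the mortality expected in the control group, so the two-year mortality of 20% used here is purely an assumption, as is the function name. The results are therefore only indicative and are not expected to match the quoted figures exactly.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_proportions(p_control, relative_reduction,
                            alpha=0.05, power=0.90):
    """Subjects per group needed to detect a relative reduction in an event
    proportion with a two-sided test (normal approximation)."""
    p_treated = p_control * (1 - relative_reduction)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance_sum = p_control * (1 - p_control) + p_treated * (1 - p_treated)
    return ceil((z_alpha + z_beta) ** 2 * variance_sum
                / (p_control - p_treated) ** 2)


# Assumed two-year mortality in the control group; not given in the text.
assumed_control_mortality = 0.20

total_for_30 = 2 * n_per_group_proportions(assumed_control_mortality, 0.30)
total_for_20 = 2 * n_per_group_proportions(assumed_control_mortality, 0.20)
print(total_for_30, total_for_20)
```

With this assumed control mortality the totals come out at about 1,640 and 3,870 men, the same order as the figures quoted above. Reducing the expected effect from 30% to 20% more than doubles the number of subjects required.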