Guest blog: An investigation of sample size calculations in surgical trials

by Chloe Jacklin, Jeremy N Rodrigues, Joanna Collins, Jonathan Cook, Conrad J Harrison
Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Oxford, UK

We can all recognise the importance of the number of participants in a randomised controlled trial (RCT). Too few participants risks an underpowered, inconclusive trial; too many makes the trial needlessly expensive and, worryingly, exposes more participants than necessary to the risks of research [1]. To calculate the appropriate number of study participants, trialists must decide on a target difference between the two intervention groups that would be considered meaningful. This decision becomes even more challenging when using a patient-reported outcome measure (PROM) because, without context, PROM scores are difficult to interpret.

PROMs are defined as “a measurement of any aspect of a patient’s health that comes directly from the patient, without interpretation of the patient’s response by a physician or anyone else” [2]. Their use has gained popularity and credibility [3], not only because they promote patient-centred care but also because they have been recognised by governing and advisory bodies [2,4]. This is particularly relevant to surgery, where new initiatives to foster patient-centred research have been launched in response to criticism of low-quality evidence [5,6]. It is therefore important that researchers, clinicians, and funding bodies are aware of the principles of measurement science underlying PROMs and their use in sample size calculations.

The Difference ELicitation in TriAls (DELTA2) guidance outlines the required reporting items for sample size calculations and provides guidance on rigorous target difference determination [1]. The target difference should be the PROM’s minimal important difference (MID). A popular definition of the MID is “the smallest difference in score in the domain of interest which patients perceived as beneficial and which would mandate, in the absence of troublesome side-effects and excessive cost, a change in the patient’s management” [7]. There are several methods for estimating MIDs, which vary in methodological rigour. Some are rather arbitrary and not patient-centred, such as using half of the standard deviation of scores (equivalent to a standardised effect size, Cohen’s d, of 0.5); others are superior, such as anchoring the PROM to a global change score. The optimal approach is to triangulate several good estimates of the MID [8–10]. Furthermore, MIDs are context-specific: they balance the benefits and disadvantages of an intervention for a given population, treatment, and follow-up duration [9]. An MID applied out of context may therefore compromise a trial’s results.
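To make the link between target difference and sample size concrete, the textbook normal-approximation formula for a two-arm, 1:1 superiority trial with a continuous outcome is sketched below. It is offered as an illustration only and is not taken from DELTA2 or from any trial we reviewed. The symbols are our own assumptions: n is the number of participants per group, σ the common standard deviation of the PROM score, δ the target difference, α the two-sided significance level and 1 − β the power. The worked case uses the half-standard-deviation rule with α = 0.05 and 80% power.

```latex
n \;=\; \frac{2\,\sigma^{2}\,\bigl(z_{1-\alpha/2} + z_{1-\beta}\bigr)^{2}}{\delta^{2}}
\qquad\text{so for } \delta = \tfrac{\sigma}{2}:\quad
n = 8\,(1.960 + 0.842)^{2} \approx 62.8 \;\Rightarrow\; 63 \text{ per group}
```

Because δ is squared in the denominator, the choice of MID dominates the result. The approximation ignores dropout and small-sample corrections, so a real calculation would usually be somewhat larger.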

We used DELTA2 to appraise the sample size calculations in RCTs in which the intervention and/or comparator was surgical and a PROM was used in the sample size calculation. We looked at trials published in high-impact journals over the last 6 years because these journals are the most cited in their fields and have large international readerships of clinicians, academics and policy makers. A total of 57 trials were eligible, of which 51 used a superiority design.

We found that sample size calculations in high-profile surgical RCTs that used a PROM as their primary outcome were suboptimal when judged against the contemporary DELTA2 standards. The shortcomings included missing reporting items, relatively arbitrary methods for determining the target difference, unclear justification for the target difference, and the application of MIDs calculated in different contexts. Of note, our sample included trials with poor target difference justification that were supported by £28 million of UK public research funding.

Our results may reflect the demand for prompt and pragmatic answers to clinical research questions, which encourages the use of convenient but suboptimal MIDs, and the desire to keep trials affordable by opting for larger target differences, which reduce the required sample size.
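The cost incentive is easy to see in a minimal Python sketch of the standard normal-approximation sample size calculation for a two-arm, 1:1 superiority trial with a continuous outcome. Everything specific here is an assumption made for illustration: the function name `n_per_group`, the hypothetical PROM scored 0 to 100 with a standard deviation of 20 points, and the default 5% two-sided significance level and 80% power. It ignores dropout and is not the method of any trial in our sample.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(target_difference, sd, alpha=0.05, power=0.80):
    """Approximate participants per arm (normal approximation, two-sided test).

    Hypothetical helper for illustration; assumes equal allocation,
    a common standard deviation `sd`, and no dropout.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    return ceil(2 * ((z_alpha + z_beta) * sd / target_difference) ** 2)


# Hypothetical PROM with an assumed SD of 20 points.
for delta in (5, 8, 10, 15):
    print(f"target difference {delta:>2} points -> {n_per_group(delta, sd=20)} per group")
```

Under these assumptions, a target difference of 10 points needs 63 participants per group, whereas 5 points needs 252. Halving the target difference roughly quadruples the trial, which is why an optimistic target difference is tempting when budgets are tight.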

While we acknowledge the difficult balance between delivering timely answers to clinical questions and investing in measurement science, there are potential solutions. Recent advances in trial methodology may lead to improvements in target difference setting [11–13]. For example, adaptive trial designs allow trialists to refine trial-specific MIDs as the trial progresses and adjust sample sizes accordingly. Funding bodies, research ethics committees and journals act as the gateway to research, and could drive improvements in RCT measurement quality by actively promoting alternative trial designs and enforcing careful target difference determination. Rigid budgets and the risk aversion of commissioners and funding applicants are potential obstacles; however, these must be weighed against the risk to participants and the excess cost caused by poor sample size calculations.


  1. Cook JA, Julious SA, Sones W, et al. DELTA 2 guidance on choosing the target difference and undertaking and reporting the sample size calculation for a randomised controlled trial. BMJ. 2018. doi:10.1136/bmj.k3750
  2. US Department of Health and Human Services, Food and Drug Administration. Guidance for industry: Patient-reported outcome measures: Use in medical product development to support labeling claims: Draft guidance. Health Qual Life Outcomes. 2006;4:1-20. doi:10.1186/1477-7525-4-79
  3. Black N. Patient reported outcome measures could help transform healthcare. BMJ. 2013;346(7896):1-5. doi:10.1136/bmj.f167
  4. Bottomley A, Jones D, Claassens L. Patient-reported outcomes: Assessment and current perspectives of the guidelines of the Food and Drug Administration and the reflection paper of the European Medicines Agency. Eur J Cancer. 2009;45(3):347-353. doi:10.1016/j.ejca.2008.09.032
  5. McCall B. UK implements national programme for surgical trials. Lancet. 2013;382(9898):1083-1084. doi:10.1016/S0140-6736(13)62009-7
  6. Royal College of Surgeons of England. Surgical Trials Initiative. https://www.rcseng.ac.uk/standards-and-research/research/surgical-trials-initiative/. Accessed October 15, 2021.
  7. Jaeschke R, Singer J, Guyatt GH. Measurement of health status. Ascertaining the minimal clinically important difference. Control Clin Trials. 1989;10(4):407-415. doi:10.1016/0197-2456(89)90005-6
  8. Rodrigues JN, Mabvuure NT, Nikkhah D, Shariff Z, Davis TRC. Minimal important changes and differences in elective hand surgery. J Hand Surg Eur Vol. 2015. doi:10.1177/1753193414553908
  9. Rodrigues JN. Different terminologies that help the interpretation of outcomes. J Hand Surg Eur Vol. 2020;45(1):97-99. doi:10.1177/1753193419870100
  10. Chan KBY, Man-Son-Hing M, Molnar FJ, Laupacis A. How well is the clinical importance of study results reported? An assessment of randomized controlled trials. CMAJ. 2001.
  11. Dimairo M, Pallmann P, Wason J, et al. The Adaptive designs CONSORT Extension (ACE) statement: A checklist with explanation and elaboration guideline for reporting randomised trials that use an adaptive design. BMJ. 2020. doi:10.1136/bmj.m115
  12. Thorlund K, Haggstrom J, Park JJ, Mills EJ. Key design considerations for adaptive clinical trials: A primer for clinicians. BMJ. 2018. doi:10.1136/bmj.k698
  13. Park JJH, Thorlund K, Mills EJ. Critical concepts in adaptive clinical trials. Clin Epidemiol. 2018. doi:10.2147/CLEP.S156708

Image source: Enlivity 2021 Creative Commons