Measuring quality of life in trials including patients on haemodialysis: methodological issues surrounding the use of the Kidney Disease Quality of Life Questionnaire

ABSTRACT Background Haemodialysis (HD) treatment causes a significant decrease in quality of life (QoL). When enrolled in a clinical trial, some patients are lost to follow-up because they die or receive a kidney transplant. It is unclear how these patients are dealt with in the analysis of QoL data, and there are questions surrounding the consistency with which QoL measures are used, reported and analysed. Methods A systematic search of electronic databases for trials measuring QoL in HD patients using any variation of the Kidney Disease Quality of Life (KDQoL) Questionnaire was conducted. The review was conducted in Covidence version 2. Quantitative analysis was conducted in Stata version 16. Results We included 61 trials in the review, of which 82% reported dropouts. The methods used to account for missing data due to dropouts included imputation (7%) and complete case analysis (72%). Few trials (7%) conducted a sensitivity analysis to assess the impact of missing data on the study results. Single imputation techniques were used, but these are only valid under strong assumptions regarding the type and pattern of missingness. There was inconsistency in the reporting of the KDQoL, with many articles (70%) amending the validated questionnaires or reporting only statistically significant results. Conclusions Missing data are not dealt with according to the missing data mechanism, which may lead to biased results. Inconsistency in the use of patient-reported outcome measures raises questions about the validity of these trials. Methodological issues in nephrology trials could be a contributing factor to why there are limited effective interventions to improve QoL in this patient group. PROSPERO Registration CRD42020223869


INTRODUCTION
An estimated 800 000 people living in America rely on dialysis treatment for end-stage renal disease (ESRD) [1]. These patients have a significant treatment and symptom burden, greatly affecting their quality of life (QoL). QoL is defined by the World Health Organization as 'an individual's perception of their position in life in the context of the culture and value systems in which they live and in relation to their goals, expectations, standards and concerns' [2]. Low levels of QoL among these patients have led to an increasing number of clinical trials focusing on improvements in QoL. However, many trials conclude without being able to meaningfully improve QoL [3]. Existing literature highlights the poor methodological quality of nephrology trials [4], which could be contributing to the lack of meaningful results.
Patient-reported outcome measures (PROMs) are key to assessing self-reported QoL. The Kidney Disease Quality of Life (KDQoL) Questionnaires are well-validated, reliable, condition-specific PROMs [5] designed to provide a comprehensive assessment of QoL among patients with ESRD, and they score highly in psychometric properties (consistency, validity and reliability). There are three versions of the questionnaire [6-8], which are described in Fig. 1. All versions of the KDQoL have the Short Form (SF)-12/SF-36 embedded in the questionnaire, a widely used instrument measuring two distinct components of QoL: physical and mental [9]. The KDQoL questionnaires have been validated in many patient populations [10, 11] to ensure that they accurately capture changes in the QoL of patients with ESRD when used in a clinical trial. It is important for validated questionnaires to be administered according to the specifications of their developers in order to retain the desired properties. This review looks at how closely the use of the KDQoL questionnaires aligns with the recommendations of the developers.

Table 1. Differences between the MCAR, MAR and MNAR missing data mechanisms, according to Rubin [14]

MCAR — Assumption: the missing data and the HRQoL outcome are independent; the reason for dropout is unrelated to the participant's current health status. Example: a participant moves abroad.
MAR — Assumption: missing data/dropout depend on the observed longitudinal measurements, e.g. dropout related to baseline characteristics. Example: male participants are less likely to report HRQoL data and drop out.
MNAR — Assumption: missing data/dropout depend on the unobserved longitudinal measurements and are directly related to the participant's current health status; missing values cannot be modelled exclusively from the data of the observed participants. Example: dropout due to adverse effects, transplantation or death.
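To make the distinctions in Table 1 concrete, the following small simulation (entirely hypothetical data, not drawn from any reviewed trial; all parameter values are invented for illustration) shows why complete case analysis can remain unbiased under MCAR but becomes biased under MNAR dropout, where sicker patients with lower QoL are more likely to leave the trial:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Simulate a QoL score on a 0-100 scale, as used by the KDQoL domains
qol = rng.normal(60, 15, n).clip(0, 100)
true_mean = qol.mean()

# MCAR: 30% dropout, independent of QoL (e.g. a participant moves abroad)
mcar_missing = rng.random(n) < 0.30
mcar_mean = qol[~mcar_missing].mean()

# MNAR: dropout probability rises as QoL falls (e.g. death, ill health)
p_drop = 1 / (1 + np.exp((qol - 40) / 10))
mnar_missing = rng.random(n) < p_drop
mnar_mean = qol[~mnar_missing].mean()

print(f"True mean QoL:        {true_mean:.1f}")
print(f"Complete-case (MCAR): {mcar_mean:.1f}")  # close to the truth
print(f"Complete-case (MNAR): {mnar_mean:.1f}")  # biased upwards
```

Under MNAR, the complete-case mean overstates the population mean because the lowest-scoring patients are systematically absent from the analysed sample; no analysis of the observed data alone can recover the truth.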
Missing KDQoL data are common in trials of haemodialysis (HD) patients, as relatively high proportions of patients either die or receive a transplant before completing the trial. Much literature exists discussing methods for dealing with missing data and the consequences of failing to do so [12]. Previous reviews highlight the use of complete case analysis and single imputation methods to deal with missing QoL data [13]. However, these methods are only valid under strong assumptions about the missing data mechanism, i.e. whether they assume the missing QoL data to be missing completely at random (MCAR), missing at random (MAR) or missing not at random (MNAR) [14]. A detailed explanation of these concepts is included in Table 1 and Supplementary data, Appendix Table A1. Guidelines support imputation and complete case analysis only if the missing data are random (i.e. MCAR or MAR) and unrelated to the treatment or intervention. Limited guidance exists on what to do otherwise, and questions remain about how missing data are dealt with in practice.
We systematically reviewed published trials that measured QoL in HD patients using the KDQoL questionnaires to address the following questions: How do trials use, report and analyse the KDQoL questionnaires? How do trials account for missing KDQoL data (specifically due to death/transplant) in their analysis?

MATERIALS AND METHODS
This systematic review is reported in line with the Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) statement [15]. The PRISMA checklist is provided in Supplementary data, Appendix Table A2. The protocol for this review has been published elsewhere [16].
The search strategy was developed with the assistance of a specialist health sciences librarian and reviewed by a nephrologist. MEDLINE, Web of Science, Cochrane Central Register of Controlled Trials, Scopus and Cumulative Index of Nursing and Allied Health Literature were searched using combinations of keywords and topics. The original search strategy, developed in MEDLINE, is included in Supplementary data, Appendix Table A3. Databases were searched from inception to 16 November 2021. Searches were limited to publications available in English. Due to the methodological nature of the review, ongoing studies and unpublished trials were excluded.

Inclusion and exclusion criteria
We included phase 3 clinical trials of any design measuring QoL using any version of the KDQoL questionnaire in adults (age ≥18 years) receiving HD. QoL could be a primary or secondary outcome.
We excluded trial protocols and reports of secondary analyses. We excluded trials that recruited a mix of patient treatments (HD, peritoneal dialysis and transplantation).

Screening
The review was conducted using Covidence version 2 software. All searches were imported into Covidence. Duplicates were removed. Title and abstract screening was conducted independently by two reviewers (H.W. and G.W.). Full-text screening was conducted by three reviewers (H.W., G.W. and H.Y.). Each study was reviewed independently by at least two reviewers and any disagreements between reviewers were resolved by discussion.

Data extraction
Data extraction was performed in Covidence using a predetermined extraction form. Pilot extraction was conducted on eight studies to amend and retest the extraction form. Two reviewers (H.W. and H.Y.) performed data extraction independently and any differences were resolved by consensus. Authors of trials with insufficient information to complete data extraction were contacted for further information.

Analysis
The information extracted was exported and tabulated. The results were synthesized using descriptive statistics. The quantitative analysis was conducted in Stata/IC version 16.0 (StataCorp, College Station, TX, USA).

Deviations from the protocol
Initial search strategies included both the KDQoL and SF-36 as measures of QoL. It was agreed by all authors that trials assessing QoL using the SF-36 could be omitted due to the number of trials found using the KDQoL (n = 399). This review is not aimed at analysing the intervention effects and focuses on the methodological quality of trials, therefore it was agreed a risk of bias assessment was unnecessary.

Study characteristics
A PRISMA flow diagram detailing the identification of studies is displayed in Fig. 2. The number of articles identified for title and abstract screening was 4376. After the exclusion of the SF-36 articles, the number of articles meeting the inclusion and exclusion criteria was 399. The final review consisted of 61 trials. Throughout data extraction, 14 authors were contacted: 11 regarding their calculation of the KDQoL total score, 2 regarding their statistical analysis and 1 regarding the methods for dealing with missing data. Only one author responded to the e-mail.
The study characteristics for included studies are presented in Table 2. Table 3 presents how the individual trials reported the domains and summary scores for the KDQoL questionnaires and highlights the inconsistencies in current reporting practice. Generally, trials did not summarize the kidney disease-specific domains using the kidney disease component summary (one trial) [75] or the kidney summary score (no trials) [76]. The summary scores from the SF-12/36, the physical component score (PCS) and the mental component score (MCS), were used by 40% (n = 24) of the trials.
The number of trials generating a 'KDQoL total score' was 16 (27%). The methods used to calculate the total score are included in Supplementary data, Appendix Table A4. Most of these trials [11 (69%)] failed to explain the methods for calculating the total score. The authors of these trials (n = 11) were contacted for further information and one responded to the e-mail. Of the six trials for which we could determine the methods for calculating the total score: two took an average of the 19 domains; one took an average of the 11 kidney-specific domains; one took the median value of the domains; one summed the PCS, MCS, effects, burden and symptoms scores; and one used a visual analogue scale (VAS) of overall health.
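One reason these ad hoc total scores are problematic is that different definitions give different values for the same patient, so results are not comparable across trials. A minimal sketch of three of the approaches described above (domain names follow the instrument, but all scores are hypothetical and the set of domains is simplified for illustration):

```python
from statistics import mean, median

# Hypothetical kidney-specific KDQoL domain scores (0-100) for one patient
kidney_domains = {
    "symptoms": 70, "effects": 65, "burden": 40, "work_status": 50,
    "cognitive_function": 80, "social_interaction": 75, "sexual_function": 60,
    "sleep": 55, "social_support": 85, "staff_encouragement": 90,
    "patient_satisfaction": 70,
}
sf_summaries = {"PCS": 38.0, "MCS": 45.0}  # SF-12/36 component scores

all_scores = list(kidney_domains.values()) + list(sf_summaries.values())

# Three of the ad hoc "total score" definitions found in the review:
mean_all = mean(all_scores)                  # average across all scores
mean_kidney = mean(kidney_domains.values())  # average of kidney-specific domains only
median_all = median(all_scores)              # median of all scores

print(mean_all, mean_kidney, median_all)  # three different "totals", one patient
```

Because each definition yields a different number, a "KDQoL total score" without an explicit calculation method cannot be interpreted or pooled, which is one reason the developers recommend against a single index.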

Statistical analysis
We evaluated the statistical techniques used in the trials to make comparisons between treatment groups. This is detailed in Supplementary data, Appendix Table A4.
Between-group comparisons. The majority of trials [48 (79%)] performed a between-group analysis that was unadjusted for other factors. A total of 11% of trials did not conduct a comparison between groups; this included a trial in which between-group comparisons were not possible because it considered repeated measures within a single group only. Only 10% of trials adjusted for baseline covariates in the comparison between groups.
Within-group analysis. Almost half of the trials conducted a within-group analysis [26 (41%)]. Most of these trials [17 (65%)] conducted their within-group analysis alongside a between-group analysis, while the remaining studies [9 (35%)] reported only a within-group comparison.

Missing data
Details relating to missing data are provided in Table 4. The extent of missing data due to dropouts relative to the number of patients randomized is detailed in Fig. 3. Almost a third (30%) of trials had >20% of patients drop out, despite the majority [42 (69%)] of trials having a duration of <6 months.
Most trials [45 (74%)] included a Consolidated Standards of Reporting Trials (CONSORT) flow diagram detailing the reasons for patient dropout post-randomization. A total of 22 trials (36%) considered the possibility of dropouts and inflated the required sample size accordingly, although only 17 of these 22 stated explicitly by how much. The expected dropout for these trials ranged from 5 to 40% [interquartile range (IQR) 10-20%], but did not seem to be related to the duration of follow-up. Four trials (7%) mentioned that the high dropout rate may cause bias and could limit the interpretation of results.
Methods for dealing with missing QoL data (primary data analysis). A total of 11 trials (18%) reported no missing data between randomization and the study endpoint, 45 (74%) used complete case analysis to deal with missing data, 4 (7%) used imputation and 1 (2%) was unclear on the methods and did not respond to e-mail. Three trials (5%) used single imputation methods: one carried the baseline QoL data forward, while the other two carried the last observation forward. One trial used multiple imputation, applying propensity methods to replace missing QoL values. Only one trial explicitly mentioned the missing data mechanism assumption when justifying the methods for dealing with dropouts.
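Last observation carried forward (LOCF), the single imputation approach seen most often in these trials, is simple to implement but silently assumes that a patient's QoL stays flat after dropout, an assumption unlikely to hold for patients who die or deteriorate. A minimal sketch with hypothetical long-format data (patient IDs, visit times and scores are all invented):

```python
import numpy as np
import pandas as pd

# Hypothetical long-format KDQoL scores: one row per patient per visit
df = pd.DataFrame({
    "patient": [1, 1, 1, 2, 2, 2, 3, 3, 3],
    "visit":   [0, 3, 6, 0, 3, 6, 0, 3, 6],   # months from randomization
    "qol":     [55.0, 60.0, 62.0,             # completer
                48.0, np.nan, np.nan,         # dropped out after baseline
                70.0, 65.0, np.nan],          # dropped out after month 3
})

# LOCF: within each patient, carry the last observed value forward
df["qol_locf"] = df.groupby("patient")["qol"].ffill()

print(df)
```

Note that patient 2's baseline value of 48 is propagated to every later visit; if that patient dropped out because of deteriorating health, LOCF overstates their later QoL, which is exactly the MNAR scenario in which single imputation is invalid.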
Sensitivity analysis relating to missing QoL data. Sensitivity analysis relating to missing QoL data was conducted by five trials (7%). Four trials (7%) conducted either complete case analysis or single imputation for their sensitivity analysis. The fifth trial performed two types of sensitivity analyses: imputing patients who died with a value of 0 and performing multiple imputation. All trials concluded that the sensitivity analysis did not change the interpretation of the results.
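The death-specific sensitivity analysis described above, assigning a QoL of 0 to patients who died, can be sketched as follows (hypothetical scores; whether 0 is a defensible value for death is itself a modelling judgement):

```python
import numpy as np

# Hypothetical end-of-trial QoL scores; np.nan marks patients who died
qol = np.array([62.0, 55.0, np.nan, 70.0, 48.0, np.nan, 66.0])
died = np.isnan(qol)

# Primary analysis: complete case, which simply ignores patients who died
complete_case_mean = np.nanmean(qol)

# Sensitivity analysis: score deaths as QoL = 0 and re-estimate
qol_imputed = np.where(died, 0.0, qol)
death_zero_mean = qol_imputed.mean()

print(complete_case_mean, death_zero_mean)
```

If the two estimates lead to different conclusions about the intervention, the trial's result is sensitive to the handling of deaths, which is precisely the check that only a handful of the reviewed trials performed.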

Deaths.
A total of 27 trials (44%) recorded dropout due to death, with the extent of dropouts ranging from 1 to 24% of the total number of patients randomized [median 4% (IQR 2-8)]. The only death-specific imputation found in this review was one trial that imputed QoL values to zero for patients who died.

Transplants.
A total of 28 trials (46%) recorded dropout due to transplants, with the extent of dropouts ranging from 1 to 38% of the total number of patients randomized [median 4% (IQR 2-8)]. No transplant-specific imputation analyses were found when reviewing these trials.

DISCUSSION
The aim of this review was to explore how current nephrology trials use, report and analyse the KDQoL questionnaires when evaluating QoL in patients receiving HD treatment. The review identified a number of methodological issues: amending validated versions of the questionnaires against the recommendations of the developers, reporting a KDQoL total score, reporting only statistically significant results, failing to account for missing QoL data appropriately (specifically death/transplant data) and using limited methods in the statistical analysis of trials. These methodological issues may be biasing the results of these trials, contributing to the limited number of nephrology trials concluding with positive results and thereby limiting their impact on clinical practice. This, in turn, could be restricting the opportunity for improvements in the QoL of the HD population. These findings support previous literature relating to the poor methodological quality of nephrology trials [4]. However, this is the first article to examine the reporting quality of the KDQoL and explore the methods used in the primary data analysis to account for dropouts, especially those due to death and transplant.

KDQoL reporting and analysis
This review identified inconsistencies in how trials reported the results of the KDQoL questionnaires, including generating a single index of QoL, which is not recommended by the developers due to the multidimensional nature of the tool [7]. An analogous misuse of a total score is common among users of the SF-36: a review of the use of the SF-36 total score [77] found 172 articles calculating a total score as a single measure of health, against the recommendations of the developers. In line with our findings, many of these [129 (75%)] were unclear on the methods used to calculate the total score. The KDQoL developers emphasize the need to analyse physical and mental health domains separately, similar to the recommendations for the SF-12/36 [78]. Researchers were also found to have modified the standardized KDQoL questionnaire, excluding certain domains because of the focus of the trial (e.g. fatigue) or the sensitivity of the questions (e.g. sexual function), and/or to have reported only those domains that were statistically significant in their trial publications. The tendency to report only significant domains is a form of reporting bias, suggesting that some authors may be cherry-picking significant results and presenting these as the main results to emphasize their findings.

Appropriate use of statistical methods
In this review, trials reporting the KDQoL as their primary outcome did not explicitly specify which component of the KDQoL formed their primary outcome. These trials referred to multiple domains when reporting the effectiveness of the interventions, leaving the focus of the trials unclear to readers. Lack of clarity in the primary outcome can also lead to questions about the sufficiency of the sample size and the power of the trial; generally these trials provided vague explanations of their sample size calculations or omitted this information completely. Many trials conducted within-group statistical comparisons of measurements at baseline and follow-up, which have been widely reported to be invalid and to produce potentially misleading conclusions [79]. In addition, only a few trials adjusted for baseline covariates, which generally improves the efficiency of the analysis and can lead to a substantial increase in power [80]. Trials used linear models and linear mixed models to analyse the longitudinal evolution of health-related quality of life (HRQoL), which are valid when the MCAR or MAR assumptions are met. Despite this, few trials discussed whether dropouts were MCAR, MAR or MNAR, and 75% of trials had at least one dropout due to transplants, adverse effects or death. It is therefore likely that these models were used under invalid conditions, increasing the potential risk of bias.

Missing data
Missing data were a common occurrence in the trials reviewed. The potential bias due to missing data depends on the reason for the missingness. Complete case analysis and single imputation methods assume that missing data are MCAR, meaning the reason for dropout is unrelated to the intervention or disease. However, in these trials, missing data were commonly due to death, transplant, ill health or treatment switching. This means that the dropout was likely related to the intervention or disease and therefore not MCAR. Few trials performed sensitivity analysis to assess the impact of the missing data assumptions on the results or discussed the potential bias due to missing data.
Similar investigations into missing data in other populations have found that complete case analysis and single imputation methods are widely used for dealing with dropouts in clinical trials. Thabut et al. [81] conducted a review of missing data in 16 idiopathic pulmonary fibrosis trials: 50% (n = 8/16) of trials conducted complete case analysis, 31% (n = 5/16) conducted last observation carried forward and the remaining trials conducted various single imputation methods. Hamel et al. [13] conducted a review of the methodological quality of cancer trials when analysing QoL data. A total of 33 trials were included in this review and 94% (31/33) of trials conducted complete case analysis to deal with missing QoL data. It seems sensible to conclude that missing data due to dropouts are poorly dealt with across many medical specialities and more robust statistical techniques are needed to account for these events in clinical trials.

Strengths and limitations
The search strategy for this review was developed and reviewed by a consultant nephrologist and a health sciences librarian. This work is based on published trials available in English and may therefore be subject to publication and language bias. However, as this is a methods-based review, we do not anticipate that this will have a significant impact on the results; it has been reported that language restrictions do not lead to evidence of systematic bias in review-based analyses [82]. Several trials (n = 14) did not have sufficient information in their articles to populate the extraction form. These authors were contacted but very few responded (n = 1). Nevertheless, we believe the included data provide sufficient evidence to draw conclusions about the current practice of nephrology trials relating to the handling of dropouts and KDQoL reporting. Our study also adhered to the PRISMA reporting guidelines.

Implications for future research
This review highlights the lack of implementation of appropriate methods when dealing with dropouts in clinical trials and the inconsistencies in reporting the validated KDQoL questionnaires. There is currently no consensus on dealing with dropouts due to death, transplantation and ill health, which are common causes of attrition in the HD population. There is an urgent need for nephrology trials to become more methodologically coherent. Poor reporting and inappropriate analyses of QoL data lead to uncertainty over which treatments may have a significant impact on the QoL of patients receiving HD. By addressing these methodological limitations, the quality of clinical trials within the field of nephrology will be enhanced, increasing their ability to influence clinical practice for the benefit of people receiving HD and their families.

CONCLUSIONS
Inadequate reporting and handling of missing QoL data in randomized controlled trials persists, and a large gap remains between statistical methods for dealing with missing data and their application in practice. This work forms the basis for future guidance on addressing missing QoL data in clinical trials. The review focused on nephrology trials, which have a unique form of dropout due to transplantation, but it is intended that future method development and guidelines apply to any setting where QoL data are collected. The review also highlights the inconsistent reporting of the KDQoL, the failure to specify a primary outcome measure, the cherry-picking of results and the alteration of validated questionnaires; these are methodological issues that researchers should avoid. Journals must enforce good practice to ensure a higher standard of research. Better, more robust reporting will help identify treatments that could improve QoL within the HD population.