The evidence indicates that low back pain screening instruments used in primary care perform poorly at assigning higher risk scores to individuals who develop chronic pain than to those who do not, whereas risks of a poor disability outcome and prolonged absenteeism are likely to be estimated with greater accuracy. The restriction to English language studies and the absence of a full search strategy mean that some relevant studies might have been missed. In addition, the robustness of the findings was not demonstrated. Further studies on these screening instruments for determining risk of poor outcome in adults with low back pain are needed to support the current findings.
Overall summary: High risk of bias in the review
The search terms were reported, but full details of the search strategy were not reported, so it was not possible to judge if it was appropriate. Only English language publications were eligible. Robustness of the findings was not addressed.
|A. Did the interpretation of findings address all of the concerns identified in Domains 1 to 4?||Probably no|
|B. Was the relevance of identified studies to the review's research question appropriately considered?||Probably yes|
|C. Did the reviewers avoid emphasizing results on the basis of their statistical significance?||Probably yes|
|Risk of bias in the review||High|
|Number of studies||18|
|Number of participants||5,834|
|Last search date||June 2016|
|Objective||To evaluate the performance of low back pain screening instruments for determining risk of poor outcome in adults with low back pain of less than three months duration.|
|Population||Adult (>18 years of age) patients with recent onset acute (zero to six weeks) and subacute (six weeks to three months) low back pain, with or without leg pain.|
|Interventions||Prognostic screening instruments: the STarT Back Tool (SBT), the Orebro Musculoskeletal Pain Screening Questionnaire (OMPSQ), the Vermont Disability Prediction Questionnaire (VDPQ), the Back Disability Risk Questionnaire (BDRQ), the Absenteeism Screening Questionnaire (ASQ), the Chronic Pain Risk Score (CPRS), and the Hancock Clinical Prediction Rule (HCPR).|
|Outcome||Pain intensity (using a visual analogue scale, numeric rating scale, verbal rating scale or Likert scale), disability as measured by validated self-report questionnaires, sick leave or days absent from work or return to work status and self-reported recovery using a global perceived effect scale or a Likert (recovery) scale.|
|Study design||Prospective cohort studies with follow-up outcomes at a minimum of 12 weeks. Retrospective cohort studies, analysis of a single arm of a randomised controlled trial or case series reports were excluded.|
|PP factor||Prognostic screening instruments: the SBT, the OMPSQ, the VDPQ, the BDRQ, the ASQ, the CPRS, and the HCPR.|
The pooled analysis of five studies investigating the SBT found poor performance for discriminating pain (≥ three) outcomes at follow-up (pooled area under the curve (AUC) 0.59, 95% confidence interval (CI) 0.55 to 0.63, n = 1,153), whereas performance was acceptable for discriminating disability (≥ 30%) outcomes in adults with low back pain (pooled AUC 0.74, 95% CI 0.66 to 0.82, three studies, n = 821).
Four of the seven studies investigating the OMPSQ reported poor performance for discriminating pain (≥ three) outcomes (pooled AUC 0.69, 95% CI 0.62 to 0.76, n = 360), whereas performance was acceptable for disability (≥ 30%) outcomes (pooled AUC 0.75, 95% CI 0.69 to 0.82, three studies, n = 512) and excellent for six-month absenteeism (> 28 days) (pooled AUC 0.83, 95% CI 0.75 to 0.90, three studies, n = 243).
On visual comparison of the discriminative performance of all instruments (five instruments), the summary estimate of five studies showed poor performance for pain (≥ three) outcomes (AUC 0.63, 95% CI 0.60 to 0.65), whereas the pooled performance of all instruments (SBT and OMPSQ) was acceptable for disability (≥ 30%) outcomes (AUC 0.71, 95% CI 0.66 to 0.76, three studies).
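The labels "poor", "acceptable" and "excellent" used above are consistent with a commonly used rule of thumb for interpreting AUC values. The cut-offs in the following minimal sketch are an assumption, because the thresholds applied by the review authors are not stated in this summary. The pooled values are those reported above, while the function names `auc_band` and `approx_se` are hypothetical. The standard error calculation assumes the confidence intervals are symmetric and based on a normal approximation.

```python
# Pooled AUC estimates with 95% CI bounds, copied from the results above.
POOLED = {
    ("SBT", "pain"): (0.59, 0.55, 0.63),
    ("SBT", "disability"): (0.74, 0.66, 0.82),
    ("OMPSQ", "pain"): (0.69, 0.62, 0.76),
    ("OMPSQ", "disability"): (0.75, 0.69, 0.82),
    ("OMPSQ", "absenteeism"): (0.83, 0.75, 0.90),
}


def auc_band(auc):
    """Label an AUC value using ASSUMED rule-of-thumb cut-offs:
    < 0.7 poor, 0.7 to < 0.8 acceptable, 0.8 to < 0.9 excellent, >= 0.9 outstanding.
    """
    if not 0.0 <= auc <= 1.0:
        raise ValueError("AUC must lie between 0 and 1")
    if auc < 0.7:
        return "poor"
    if auc < 0.8:
        return "acceptable"
    if auc < 0.9:
        return "excellent"
    return "outstanding"


def approx_se(lower, upper, z=1.96):
    """Approximate standard error implied by a symmetric 95% CI (assumption)."""
    return (upper - lower) / (2 * z)


for (instrument, outcome), (auc, lower, upper) in POOLED.items():
    label = auc_band(auc)
    se = approx_se(lower, upper)
    print(f"{instrument:6s} {outcome:12s} AUC {auc:.2f} ({label}), SE about {se:.3f}")
```

Under these assumed cut-offs, both pain estimates fall in the "poor" band, both disability estimates in the "acceptable" band, and the OMPSQ absenteeism estimate in the "excellent" band, matching the wording of the results.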
The eligibility criteria were well described and appeared appropriate to address the review question. No restrictions based on study characteristics or sources of information were reported.
|1.1 Did the review adhere to pre-defined objectives and eligibility criteria?||Probably yes|
|1.2 Were the eligibility criteria appropriate for the review question?||Probably yes|
|1.3 Were eligibility criteria unambiguous?||Probably yes|
|1.4 Were all restrictions in eligibility criteria based on study characteristics appropriate (e.g. date, sample size, study quality, outcomes measured)?||Probably yes|
|1.5 Were any restrictions in eligibility criteria based on sources of information appropriate (e.g. publication status or format, language, availability of data)?||Probably yes|
|Concerns regarding specification of study eligibility criteria||Low|
MEDLINE, EMBASE, CINAHL, PsycINFO, PEDro, Web of Science, SciVerse SCOPUS, and Cochrane Central Register of Controlled Trials were searched to identify all relevant studies. The reference lists of all included articles and relevant review articles were searched to locate any additional studies. The search terms were reported, but full details of the search strategy were not reported. The searches were not limited based on time, but studies were limited to English language publications. Two review authors independently performed the study selection and any disagreements were resolved through discussion.
|2.1 Did the search include an appropriate range of databases/electronic sources for published and unpublished reports?||Yes|
|2.2 Were methods additional to database searching used to identify relevant reports?||Probably yes|
|2.3 Were the terms and structure of the search strategy likely to retrieve as many eligible studies as possible?||Probably no|
|2.4 Were restrictions based on date, publication format, or language appropriate?||Probably no|
|2.5 Were efforts made to minimise error in selection of studies?||Yes|
|Concerns regarding methods used to identify and/or select studies||High|
Two reviewers independently extracted relevant data using a standardised spreadsheet. Sufficient study characteristics appear to have been extracted to allow interpretation of the results, and study results were appropriately collected for the synthesis. Two review authors independently assessed the methodological quality of the included studies using the Quality In Prognosis Studies (QUIPS) tool. Disagreements in ratings were discussed and, if not resolved, a third review author was consulted.
|3.1 Were efforts made to minimise error in data collection?||Yes|
|3.2 Were sufficient study characteristics considered for both review authors and readers to be able to interpret the results?||Probably yes|
|3.3 Were all relevant study results collected for use in the synthesis?||Probably yes|
|3.4 Was risk of bias (or methodological quality) formally assessed using appropriate criteria?||Probably yes|
|3.5 Were efforts made to minimise error in risk of bias assessment?||Yes|
|Concerns regarding methods used to collect data and appraise studies||Low|
The synthesis included all eligible studies. The method of analysis was explained and appeared appropriate. Predictive validity is conventionally assessed using receiver operating characteristic curve analysis, with the area under the curve statistic being the most routinely reported measure of performance. Heterogeneity was assessed and found to be low in four comparisons, moderate in one comparison and substantial in one comparison. Post-hoc sensitivity analysis was undertaken to explore the influence of study variation. Robustness of the findings was not addressed. Quality of the individual studies was considered in the synthesis of findings.
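As an illustration of what the area under the curve statistic measures, the sketch below computes it as the proportion of pairs (one person with a poor outcome, one without) in which the person with the poor outcome received the higher screening score, with ties counted as half. The scores and outcomes are hypothetical values invented for the example, not data from the review, and the function name `auc_from_scores` is likewise hypothetical. This is a generic pairwise calculation, not the analysis method of the review authors.

```python
def auc_from_scores(scores, outcomes):
    """Return the AUC for risk scores against binary outcomes (1 = poor outcome).

    Computed as the fraction of (poor outcome, good outcome) pairs in which
    the poor-outcome person has the higher score; ties contribute 0.5.
    """
    poor = [s for s, y in zip(scores, outcomes) if y == 1]
    good = [s for s, y in zip(scores, outcomes) if y == 0]
    if not poor or not good:
        raise ValueError("both outcome groups must be present")
    concordant = 0.0
    for p in poor:
        for g in good:
            if p > g:
                concordant += 1.0
            elif p == g:
                concordant += 0.5
    return concordant / (len(poor) * len(good))


# Hypothetical screening scores and follow-up outcomes for eight patients.
scores = [2, 5, 3, 7, 4, 6, 1, 8]
outcomes = [0, 1, 0, 1, 1, 0, 0, 1]

print(auc_from_scores(scores, outcomes))  # 0.875
```

A value of 0.5 corresponds to scores that do not separate the two groups at all, and a value of 1.0 to scores that rank every person with a poor outcome above every person without one.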
|4.1 Did the synthesis include all studies that it should?||Probably yes|
|4.2 Were all pre-defined analyses reported or departures explained?||Probably yes|
|4.3 Was the synthesis appropriate given the degree of similarity in the research questions, study designs and outcomes across included studies?||Probably yes|
|4.4 Was between-study variation minimal or addressed in the synthesis?||Probably yes|
|4.5 Were the findings robust, e.g. as demonstrated through funnel plot or sensitivity analyses?||Probably no|
|4.6 Were biases in primary studies minimal or addressed in the synthesis?||Probably yes|
|Concerns regarding synthesis and findings||High|
- low back pain