How studies are rated

Every study in the Evidence Hub carries a certainty rating. This page explains what the ratings mean, how they are decided, and the honest state of the evidence in VR for speech, voice, and communication work.

Ratings use a simplified four-tier scheme (High, Moderate, Low, Very Low) informed by the GRADE working group. The rating reflects how confidently the study's findings can be applied - not the quality of the study authors or their work. A Very Low rating does not mean the study is bad; it often means the study is a pilot or case series, which is exactly what you want early in a field.

Current distribution

High 9 studies
Moderate 39 studies
Low 36 studies
Very Low 23 studies
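
For readers who prefer shares to raw counts, the tally can be converted to percentages. The short Python sketch below assumes the four counts listed above are current; the variable names are illustrative and not part of any withVR tooling.

```python
# Counts copied from the "Current distribution" list above.
counts = {"High": 9, "Moderate": 39, "Low": 36, "Very Low": 23}

# Total number of rated studies.
total = sum(counts.values())

# Share of each tier as a percentage of all rated studies, to one decimal place.
shares = {tier: round(100 * n / total, 1) for tier, n in counts.items()}

for tier, n in counts.items():
    print(f"{tier:<9} {n:>3} studies  {shares[tier]:>5.1f}%")
print(f"{'Total':<9} {total:>3} studies")
```

Because each share is rounded separately, the percentages may not sum to exactly 100.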

What each rating means

High certainty

Very confident that the true effect lies close to the estimate. Expected for findings where multiple high-quality randomized controlled trials converge, across different sites and research groups, with minimal risk of bias, inconsistency, or indirectness. A single study never qualifies for High on its own - High is a property of a body of evidence, not one paper.

Moderate certainty

Reasonably confident in the estimated effect; the true effect is likely close but could plausibly differ. Typical for well-designed single RCTs with adequate samples, and for systematic reviews of heterogeneous primary studies.

Low certainty

Limited confidence. The true effect may be substantially different from the estimate. Common for small-sample RCTs, quasi-experimental designs, and qualitative studies - all of which contribute real knowledge, but cannot alone support firm conclusions.

Very low certainty

Very little confidence in any estimate of effect. Case studies, small pilots, and narrative or conceptual papers sit here. These studies are still valuable - they establish feasibility, surface questions, and ground later controlled work - but they are not evidence of effect.
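
As a rough summary of the tier descriptions above, the typical starting point for each study design can be written as a lookup table. This is only an illustrative sketch in Python: the design labels and the function name are hypothetical, and it is not how ratings are actually assigned, since the page describes them as editorial judgments rather than the output of a rule.

```python
from enum import Enum


class Tier(Enum):
    HIGH = "High"
    MODERATE = "Moderate"
    LOW = "Low"
    VERY_LOW = "Very Low"


# Typical tier per design, paraphrasing the tier descriptions on this page.
# The keys are hypothetical labels chosen for this example.
TYPICAL_TIER = {
    "single_rct_adequate_sample": Tier.MODERATE,
    "systematic_review_heterogeneous": Tier.MODERATE,
    "small_sample_rct": Tier.LOW,
    "quasi_experimental": Tier.LOW,
    "qualitative": Tier.LOW,
    "case_study": Tier.VERY_LOW,
    "small_pilot": Tier.VERY_LOW,
    "narrative_or_conceptual": Tier.VERY_LOW,
}


def typical_tier(design: str) -> Tier:
    """Return the tier a design typically lands in; raises KeyError if unknown."""
    return TYPICAL_TIER[design]


print(typical_tier("small_pilot").value)
```

No design label maps to High in this table, which mirrors the point that High describes a body of converging evidence and not a single paper.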

Why no studies are currently rated High

This is the honest answer: VR in speech therapy is a young research field. For a claim to reach High certainty, the literature usually needs multiple high-quality RCTs, ideally pre-registered and multi-site, converging on the same finding. For most of the questions the Evidence Hub addresses, this body of evidence does not yet exist.

The strongest candidates in the hub currently are single RCTs in adjacent areas - social anxiety (Anderson 2013), autism (Ip 2018), stuttering (Cream 2010), tinnitus (Malinvaud 2016). All sit at Moderate. Each is a good study; none on its own carries enough weight to raise a claim to High without replication.

The absence of High ratings in this hub is not a gap in the hub. It reflects where the field is. A rating distribution that reads "0% High / ~20% Moderate / ~35% Low / ~45% Very Low" is what you would expect for a rigorously assessed young research area. Inflated ratings would be easier on the eye and much less honest.

What could warrant a High rating

Future additions to the hub that might reach High include:

Realistically, this level of evidence is several years away for most claims in VR speech therapy. A plausible candidate for early promotion to High is the validity of virtual audiences: whether they elicit communicative responses comparable to those elicited by real audiences. The evidence here is converging steadily as new work replicates earlier findings. A formal systematic review of this specific question would be welcome.

How ratings are decided

Each study's rating is assigned editorially by withVR, drawing on the paper's design (RCT / quasi-experimental / case / review), sample size, population, and the paper's own stated limitations. A short rationale accompanies each rating, visible on the study page by expanding "How this was rated."

Ratings reflect editorial judgment, not a formal GRADE assessment process of the kind used in Cochrane reviews. The scheme is simplified deliberately: four tiers are enough to signal how confidently clinicians and researchers should apply a finding, without implying a precision the editorial process does not have.

Corrections and suggestions welcome

If you believe a study has been rated incorrectly, or a study has been missed that warrants inclusion, send a note to hello@withvr.app. The scheme is intended to be transparent and correctable.

Know of research that should be here? If a peer-reviewed study on VR in speech, voice, hearing, or communication work is not listed, send the reference to hello@withvr.app.