A paper lands in your inbox. Someone on your team says “look at this VR study, it sounds useful.” You want to know what to make of it before your next session or your next commissioning meeting. Where do you even start?

This is a short guide to reading a VR speech therapy study with a critical eye. Not a research methods course. Not a statistics primer. Just a practical set of questions a speech-language professional can hold in mind to tell the difference between a study that supports a clinical decision and a study that is interesting but not ready to change what you do.

Start with who, not what

Before anything else, read the Participants section. Who was in this study?

If the population in the study is very different from the people you see in clinic, the findings do not necessarily transfer. This is not a criticism of the study. It is a reminder that no single study answers every question, and evidence needs to be matched to the population you care about.

Understand what they actually compared

The next section worth reading is Design. What did the researchers compare?

Ask yourself: if the intervention had no effect at all, is there any other reason the outcomes might have changed across conditions? If the answer is “yes, many reasons,” then the design is weak for a causal claim. A good study design rules out most alternatives.

Look at what they measured

The Outcome measures section tells you what the researchers decided counted as evidence. This matters because different measures tell different stories.

The most convincing VR validation studies combine measures. If self-reported anxiety rises on the SUDS (Subjective Units of Distress Scale) and heart rate and voice measures shift in the same direction, that is stronger evidence than any single measure alone. Watch for studies that report only one type of measure - they tell a partial story.

Check whether the effect is actually large

A finding can be statistically significant and practically meaningless. This is a hard lesson. It happens because statistical significance depends on sample size: a tiny difference will be statistically significant if the sample is large enough.

What you want is an effect size. Common ones in this literature are Cohen's d, the correlation coefficient r, and eta-squared.

If a paper reports only p-values without effect sizes, that is a weakness. If it reports effect sizes, check them. A small p-value paired with a small effect size can still be clinically uninteresting even if the statistics are legitimate.
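To see why significance and effect size come apart, here is a minimal Python sketch. It is an illustration under stated assumptions, not a method taken from any particular study: the function names are hypothetical, the groups are assumed to be equal in size, and the p-value uses a normal approximation that is rough for small samples. It holds a small effect (d = 0.1) fixed and only changes the number of participants per group.

```python
import math
from statistics import mean, stdev


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled standard deviation."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * stdev(group_a) ** 2 + (n_b - 1) * stdev(group_b) ** 2
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_var)


def approx_p_value(d, n_per_group):
    """Two-sided p-value for a two-group comparison with equal group sizes.

    Assumption: normal approximation to the test statistic, which is only
    a rough guide when groups are small.
    """
    z = abs(d) * math.sqrt(n_per_group / 2)
    return math.erfc(z / math.sqrt(2))


# Same small effect (d = 0.1), increasing sample size per group.
for n in (20, 200, 2000):
    print(f"n per group = {n:5d}  d = 0.1  approx p = {approx_p_value(0.1, n):.4f}")
```

The effect size never changes across the three rows, yet the p-value shrinks as the groups grow. That is the pattern to keep in mind when a large study reports a "significant" result without saying how big the difference was.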

Read the limitations section (seriously)

Authors know their own studies’ limitations better than you do. Read what they say. A good limitations section will tell you:

If a paper’s limitations section is a single throwaway paragraph, treat the findings cautiously. If the authors have thought carefully about what their study can and cannot tell us, give the paper more weight.

Distinguish feasibility from effect

A lot of early VR research is about feasibility rather than effect. A feasibility study asks: “can this be done at all? Will participants tolerate it? Does the equipment work as intended?” These are legitimate research questions, and the findings can be informative - but they are not evidence that the intervention works.

A feasibility study with five participants showing anxiety decreased across a week tells you that a week of practice is feasible. It does not tell you that VR caused the change. Other things could have - practice effects, expectation, the researcher’s attention, regression to the mean.

When you see a small-sample pre-post VR study with favorable results, ask: “is this a pilot telling me the idea is worth a bigger study, or is this being presented as evidence of effect?” The first is useful. The second would be overclaiming.
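Regression to the mean is the least intuitive of those alternative explanations, so a small simulation may help. The Python sketch below is purely illustrative: the function name, the score scale, the cutoff, and the noise levels are all made-up assumptions. It enrolls only people whose noisy baseline anxiety score is high, then re-measures them with no intervention of any kind.

```python
import random
from statistics import mean


def simulate_regression_to_mean(n_people=10_000, cutoff=70.0, seed=0):
    """Return (baseline_mean, followup_mean) for people enrolled on a high score.

    Hypothetical setup: each person has a stable true anxiety level, and
    every measurement adds independent noise. Nothing is done between the
    two measurements.
    """
    rng = random.Random(seed)
    baseline_scores, followup_scores = [], []
    for _ in range(n_people):
        true_level = rng.gauss(50, 10)            # stable underlying anxiety
        baseline = true_level + rng.gauss(0, 10)  # noisy score at intake
        followup = true_level + rng.gauss(0, 10)  # noisy re-test, no treatment
        if baseline >= cutoff:                    # only high scorers are enrolled
            baseline_scores.append(baseline)
            followup_scores.append(followup)
    return mean(baseline_scores), mean(followup_scores)


pre, post = simulate_regression_to_mean()
print(f"enrolled group: baseline mean = {pre:.1f}, follow-up mean = {post:.1f}")
```

The enrolled group's average drops at follow-up even though no one received anything. A pre-post study without a comparison group cannot separate this kind of drop from a real treatment effect.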

Ask about generalization honestly

Most VR studies measure responses inside the virtual environment. Fewer measure whether gains transfer to real-world situations. And yet what clients usually want is change in real life, not in a virtual room.

Questions to hold:

If none of these are present, the study cannot tell you much about real-world transfer. That is not a flaw - it is a limitation of scope. But it matters when you are deciding what a study supports.

Check who funded the study

The Funding and Conflicts of Interest declarations are worth reading. Independent funding from research councils, universities, or government bodies is different from industry funding or a study conducted by a company on its own product.

Neither kind of funding automatically invalidates a study. But knowing who paid for it and who has a financial stake in its results helps you weigh the findings. A study on virtual audiences funded by a research council carries different weight than a study on a specific VR product conducted by that product’s company.

A short checklist

If a VR speech therapy study comes across your desk, these six questions will get you most of the way:

  1. Who was studied? Sample size and population. Rough guide: 5 = pilot, 15 = small, 50+ = more likely to generalize.
  2. What was the design? Within / between / pre-post / RCT. What alternatives does it rule out?
  3. What was measured? Self-report, behavior, physiology, acoustics. Multiple measures = stronger.
  4. How big is the effect? Cohen's d, r, or eta-squared - not just the p-value.
  5. What did authors flag? Read the limitations section seriously. A thin one is itself a signal.
  6. Did it test transfer? Was anything measured outside the VR setting? Real-world transfer is the clinical question.

Print or save this card.

None of this requires a statistics background. It requires slowing down and asking the questions authors usually answer in plain language somewhere in the paper.

Further reading