Most of the people I work with who use Therapy withVR are not researching anxiety. They are speech-language professionals supporting clients who stutter, clients doing voice work, autistic adults preparing for difficult conversations, or someone on the cusp of a job interview. But anxiety almost always shows up. A speaker whose speech feels easy in the clinic room and locks up in front of a meeting. A voice client whose pitch and resonance are exactly right with the SLP and gone the moment they speak to a stranger. A teenager who has the words but not the willingness to use them in class.
For decades, the treatments most explicitly designed for this kind of social-evaluative anxiety have come out of the anxiety-disorders literature, not the speech-therapy literature. That gap is closing - there is now meaningful direct work on speaking anxiety in stuttering and voice contexts - but the deepest body of evidence on VR for socially evaluative speaking lives in trials done on social anxiety disorder and fear of public speaking. So when a clinician asks me, fairly, what evidence supports VR as a way to support clients in everyday speaking situations, the honest answer involves walking back through what those adjacent literatures actually show.
This post pulls together three randomized controlled trials and one meta-analysis. Each is a study you can take seriously - adequately sized, methodologically careful, peer-reviewed in respected venues. None of them was designed specifically to test speech therapy. But each tells us something useful about whether and how VR-based practice can support people in the kinds of everyday speaking situations that matter to them.
The four studies at a glance
VR-based exposure for social and public-speaking anxiety
Anderson et al. - VR exposure equivalent to group CBT
Both active groups significantly improved over waitlist; no statistical difference between VR exposure and group CBT. Gains maintained at 12-month follow-up.
Wallach, Safir & Bar-Zvi - VR-CBT equals traditional CBT, lower dropout
Equivalent reductions in public-speaking anxiety. Dropout was lower in the VR condition; participants rated VR-CBT as more attractive.
Bouchard et al. - pre-registered superiority of VR over in-vivo
VR exposure was significantly more effective than real-world exposure on the primary outcome at post-treatment and at 6-month follow-up. Authors documented practical advantages: less cumbersome, lower cost, easier confidentiality.
Opriş et al. - dose-response across the literature
Pooled effect sizes show large gains over waitlist and equivalent effects to non-VR active treatments. More sessions produced larger gains - a clear dose-response pattern.
All four studies are linked individually in the sections below and in the Evidence Hub. Sample sizes are labeled by tier so studies can be compared: under 20 = pilot, 20-50 = small, 50+ = generalizable.
Anderson 2013: VR exposure equals group CBT, with gains that last
Anderson and colleagues (2013) ran a randomized controlled trial with 97 adults experiencing social anxiety. Participants were assigned to one of three groups: VR exposure as individual sessions, group cognitive-behavioral therapy as a well-established active comparator, or a waitlist. Both active conditions ran for the same number of sessions over the same period.
Both active groups improved significantly more than the waitlist group, and their improvements were equivalent to each other: there was no statistical difference between VR exposure delivered individually and cognitive-behavioral therapy delivered in a group setting. The gains were also maintained at 12-month follow-up - not just durable for a few weeks, but stable over a year.
What I take from this for speech-therapy practice. Group CBT is logistically demanding and not always accessible. It also requires the person to participate in a group setting, which is often the very thing they find hardest. VR exposure gives clinicians a way to deliver an equivalent exposure-based experience individually. For clients who avoid groups for the same reasons that brought them to therapy in the first place, this matters.
Wallach 2009: VR-CBT equals traditional CBT for public-speaking anxiety, with lower dropout
Wallach, Safir, and Bar-Zvi (2009) ran an 88-person RCT focused specifically on public-speaking anxiety. Participants were randomly assigned to VR-based cognitive-behavioral therapy, traditional CBT, or a waitlist. Both active conditions ran for 12 weekly sessions.
The headline finding was that VR-CBT and traditional CBT produced equivalent reductions in public-speaking anxiety, both significantly larger than the waitlist effect. But there was a secondary finding that I think gets too little attention. Dropout was lower in the VR condition. Participants rated VR-CBT as more attractive than traditional CBT.
Real-world treatment effects depend on two things, not one. They depend on per-session efficacy, and they depend on whether people complete the course of treatment. A modality that is equally efficacious per session but produces less dropout delivers more aggregate benefit at the population level than its per-session effect alone would suggest. If twenty people start an intervention and fifteen finish it, that intervention helps fifteen people. If twenty people start a different but equally efficacious intervention and eighteen finish it, that intervention helps eighteen people. The dropout difference is not a side note.
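The arithmetic in the paragraph above can be sketched in a few lines. This is an illustration only: the function name `people_helped` and the simplifying assumption that only completers benefit, all at the same rate, are assumptions made for the example and are not taken from the Wallach trial.

```python
def people_helped(starters: int, completers: int, benefit_rate: float = 1.0) -> float:
    """Expected number of people helped, assuming (for illustration only)
    that only completers benefit, each with probability `benefit_rate`."""
    if not 0 <= completers <= starters:
        raise ValueError("completers must be between 0 and starters")
    return completers * benefit_rate


# The two hypothetical interventions described above: same efficacy, different dropout
a = people_helped(starters=20, completers=15)
b = people_helped(starters=20, completers=18)
print(a, b)  # 15.0 18.0
print(f"{(b - a) / a:.0%} more people helped")  # 20% more people helped
```

Holding `benefit_rate` equal across the two calls is what "equally efficacious" means in this sketch; the entire difference in people helped then comes from completion.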
Bouchard 2017: pre-registered superiority trial shows VR outperforms real-world exposure
The trial that I find most methodologically interesting is Bouchard and colleagues (2017), published in British Journal of Psychiatry. They ran a three-arm RCT with 59 adults diagnosed with social anxiety disorder. The three arms were CBT with in-virtuo (VR) exposure, CBT with in-vivo (real-world) exposure, and a waitlist control. Both active arms received 14 weekly sessions of identical structure - only the modality of exposure differed.
The crucial methodological detail is that this was a pre-registered superiority trial. The researchers committed in advance to testing whether VR exposure was more effective than in-vivo exposure - a hypothesis that sets a higher bar than equivalence. Pre-registration matters because it forecloses some of the post-hoc analytic flexibility that has historically inflated effect estimates in clinical research.
The result: VR exposure was significantly more effective than in-vivo exposure. On the primary outcome (LSAS-SR), VR outperformed in-vivo exposure at post-treatment (t(56)=2.02, p<.05) and at 6-month follow-up (F(1,37)=4.78, p<.05). Reliable change rates were 76.5% for VR and 68.3% for in-vivo - a directionally consistent gap that did not reach statistical significance on its own, but the primary continuous outcome did. The researchers also documented practical advantages for the VR condition. In their own words, in-virtuo exposure was “significantly less cumbersome and costly to conduct than in vivo ones, in terms of having access to relevant stimuli to induce ridicule, duration, preparation, worries about confidentiality, costs of gathering staff members to attend at public speaking exercises.”
[Figure: Reliable change after 14 weeks of CBT for social anxiety (Bouchard et al. 2017, n = 59, pre-registered three-arm RCT). Both active treatments produced substantially more reliable change than waitlist. VR slightly outperformed in-vivo here; the gap was directionally consistent, but the LSAS-SR continuous outcome was where significance landed.]
Source: Bouchard S, Dumoulin S, Robillard G, et al. (2017). Virtual reality compared with in vivo exposure in the treatment of social anxiety disorder: a three-arm randomised controlled trial. British Journal of Psychiatry, 210(4), 276-283. DOI: 10.1192/bjp.bp.116.184234.
I want to draw a specific implication from this. The argument for VR-based speaking practice is sometimes framed as “VR is a useful preparatory step before real-world practice.” This trial supports a stronger reading. VR exposure is not a way of getting ready for real-world exposure. It is an exposure modality in its own right - one that in this trial produced superior outcomes on the primary measure while being significantly less effort to deliver.
Opriş 2012: a meta-analytic dose-response pattern
The fourth study I want to discuss is not a trial but a meta-analysis. Opriş and colleagues (2012) pooled effect sizes across multiple primary studies of VR exposure for anxiety disorders. They found, as the trials reviewed above also suggest, that VR exposure produced large effect sizes versus waitlist and equivalent effects to non-VR evidence-based interventions.
But the finding I want to highlight is a different one: the dose-response relationship. Across studies, more sessions produced larger gains. This is a pattern consistent with how exposure-based therapies are theorized to work - through repeated, structured engagement with the feared situation, allowing habituation and cognitive change to accumulate. It is not a property unique to VR, but it has a specific clinical implication for VR-based work.
The implication is this: a one-session demonstration is not the right test of what VR-based speaking practice can do. The mechanism is sustained, multi-session practice - graded, controlled, repeated - and the evidence suggests its benefits scale with practice volume. Clinicians working with VR should plan multi-session practice progressions, not one-off trials.
Four studies, one direction
[Figure: Effect-size forest plot of VR-versus-comparison contrasts for the four social-anxiety studies, plotted with reported effect sizes. Top panel: against active comparison conditions, no study favored the comparator over VR; one (Bouchard 2017) favored VR. Bottom panel: against no-treatment waitlist, VR produced large pooled effects. Convergent evidence with consistent direction across study designs.]
Sources: Anderson et al. 2013 (J Consult Clin Psychol), Wallach et al. 2009 (Behav Modif), Bouchard et al. 2017 (Br J Psychiatry), Opriş et al. 2012 (Depress Anxiety). Anderson 2013 is shown qualitatively because the original PDF is rasterized and specific Cohen's d values could not be machine-extracted; the source paper reports equivalence between VR and group CBT on the LSAS. Bouchard's d estimate is derived from the reported t(56)=2.02 via d = 2t/√df. Wallach's d is the pooled VRCBT+CBT-vs-WL effect on LSAS-Avoidance. Opriş is the meta-analytic Cohen's d weighted across primary outcomes.
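The conversion named in the source note above, d = 2t/√df, can be checked directly. This is a minimal sketch: the function name `d_from_t` is hypothetical, and the formula is an approximation to Cohen's d that assumes an independent-samples t statistic from two groups of similar size.

```python
import math


def d_from_t(t: float, df: int) -> float:
    """Approximate Cohen's d from an independent-samples t statistic.

    Uses d = 2t / sqrt(df), which assumes two groups of similar size.
    """
    if df <= 0:
        raise ValueError("df must be positive")
    return 2 * t / math.sqrt(df)


# Bouchard et al. (2017), post-treatment LSAS-SR contrast: t(56) = 2.02
d = d_from_t(2.02, 56)
print(round(d, 2))  # 0.54
```

On Cohen's conventional benchmarks, a value near 0.5 is usually read as a medium effect.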
What this means for speech therapy
These four studies were designed to address social anxiety disorder and fear of public speaking, not stuttering, voice work, autistic communication, or any other communication-specific context. I want to be careful not to overclaim. The transfer from “VR exposure works for social anxiety” to “VR practice supports speaking confidence in a person who stutters” is a transfer of evidence across populations, and that transfer needs its own validation. The Evidence Hub already has direct studies of VR with people who stutter - Brundage and Hancock (2015), Brock (2023), Kumar (2024) - and growing work in voice. The most recent direct test in autism is McCleery et al. (2026), a parallel RCT in autistic teens and adults that compared three short VR sessions with video modeling for police-interaction practice. The VR group gave significantly more appropriate responses and showed calmer body language during a real-world post-test with actual officers; the video-modeling control did not. Those direct studies do most of the heavy lifting for population-specific claims.
What the social-anxiety literature does is establish four things that are useful as background.
One: graded, controllable, repeatable exposure works. The mechanism is established across multiple anxiety conditions in trials with active comparators and adequate samples. The active ingredient is not VR per se - it is exposure delivered in a way that allows the clinician to control intensity, repeat reliably, and progress in steps the person can manage. VR is a way to deliver those properties; the properties matter more than the medium.
Two: VR can replace in-vivo exposure, not just precede it. Bouchard’s superiority finding makes this point more directly than equivalence would. For clinicians working in settings where arranging real-world speaking practice is difficult - rural areas, schools with privacy constraints, contexts where confidentiality matters - this does more than license VR as a substitute. VR is not a fallback when in-vivo is unavailable; it is an exposure modality in its own right - one that in this trial produced superior outcomes on the primary measure with significantly lower logistical burden.
Three: dropout matters as much as per-session efficacy. The Wallach finding deserves more clinical attention than it gets. A modality that engages people enough to keep them in treatment delivers more aggregate benefit, even at equivalent per-session effects.
Four: dose matters, and one session is not the test. The dose-response pattern in the meta-analysis argues for planning multi-session practice progressions rather than treating VR as a one-shot demonstration. A common failure mode I see in early clinical use is over-relying on the “wow” of the first session. The evidence suggests that is the wrong place to put the weight.
What I do not claim
I am not claiming that these studies prove VR-based practice works for stuttering, voice work, or any other communication-specific context. The direct studies in those areas are smaller, less mature, and at a different stage of evidence development. The honest framing is: a robust adjacent evidence base supports the modality and its underlying mechanism, the direct evidence is smaller and growing, and clinicians should weigh both.
I am also not claiming that VR replaces clinical expertise. Every trial above was designed by clinicians, delivered with clinician judgment, and evaluated with clinician oversight. The technology is the medium, not the practitioner. The findings translate when the practice is good practice.
A note on the next twenty years
Much of the social-anxiety VR work I have referenced here is now ten to twenty years old. The technology has changed substantially - from tethered headsets in research labs to standalone consumer devices that fit in a clinic bag. The structure of the practice has not changed. Graded, controllable, repeatable exposure remains the active mechanism. What changes is access: who can deliver it, where, and at what cost.
That is why I think the evidence we already have - adjacent though it is - supports cautious confidence about the direction of the field, even as direct evidence continues to accumulate in the specific areas speech-language professionals work in. The gap is closing. And in the meantime, two decades of VR research on social anxiety are doing more useful work for speech-therapy practice than they sometimes get credit for.
Further reading
- Speaking Anxiety topic - Evidence Hub topic with all hub studies on speaking anxiety
- Anderson et al. (2013) - The 97-person social-anxiety RCT
- Bouchard et al. (2017) - Pre-registered superiority RCT: VR more effective than in-vivo exposure on the primary outcome
- Wallach et al. (2009) - Public-speaking anxiety RCT with the lower-dropout finding
- Opriş et al. (2012) - Meta-analysis with the dose-response pattern
- Ecological Validity in VR Speech Therapy - What the evidence on virtual-versus-real audiences shows
- VR for Gender-Affirming Voice Training: What the First RCT Found - Direct application of the willingness-to-communicate finding to voice work
- How the NHS Uses VR to Support Young People Who Stammer - The same research being applied in NHS settings
- How to read a VR speech therapy study - A guide for interpreting research like this
- Further reading - Books and communities that shape current practice
