Within-subjects study in 31 vocally healthy adults: auditory, visual, and audiovisual room cues in immersive VR all measurably change self-perceived vocal loudness, effort, comfort, and acoustic output

Daşdöğen Ü et al. · 2023 · Journal of Voice · Experimental · n = 31 · Vocally healthy adults · DOI
Evidence certainty: Moderate
How this was rated

Within-subjects design with 31 vocally healthy adults across 18 crossed conditions (auditory-only, visual-only, audiovisual, each with and without background noise), a strong factorial design for hypothesis testing. Published in the peer-reviewed Journal of Voice (Elsevier). Self-reported ratings were combined with objective acoustic measures. Limitations: vocally healthy adults only (clinical voice populations were not tested); a single research-grade VR system; and an 18-condition design that optimises mechanism testing over clinical-protocol validation. The findings support the realism and validity of voice work in VR but do not directly establish therapeutic efficacy in voice patients; that requires follow-on work in clinical populations.

Ratings use a simplified four-tier scheme (High, Moderate, Low, Very Low) informed by the GRADE working group. Learn more about how studies are rated.

Thirty-one vocally healthy men and women were tested under 18 sensory-input conditions in immersive virtual reality - two auditory rooms with different reverberation times, two visual rooms with different volumes, and audiovisual combinations - each with and without background noise. Speakers performed counting, sustained vowels, an all-voiced CAPE-V sentence, and a Rainbow Passage sentence. Self-perceived vocal loudness and effort INCREASED, and self-perceived vocal comfort DECREASED, as room volume, speaker-to-listener distance, audiovisual richness, and background noise increased. Sound pressure level (SPL) and spectral moments (mean, SD, skewness, kurtosis) showed concomitant changes. Visual and audiovisual input - not just auditory - measurably shaped voice production.

Clinical bottom line

A controlled within-subjects experimental study in 31 vocally healthy adults showed that visual and audiovisual room cues in immersive VR - not just acoustic cues - measurably change self-perceived vocal loudness, effort, and comfort, and also change acoustic output (SPL and spectral moments). This is foundational realism-and-validity evidence for using immersive VR in voice therapy: it establishes that the immersive visual context can drive vocal adaptations beyond what acoustic simulation alone produces. Clinicians using or considering immersive VR for voice work should expect the visual environment to be a meaningful therapeutic variable, not a backdrop.

Key findings

  • 31 vocally healthy adults (men and women) tested under 18 sensory-input conditions in immersive VR: 2 auditory rooms (varying reverberation) × 2 visual rooms (varying volume) × audiovisual combinations × with/without background noise
  • Self-perceived VOCAL LOUDNESS increased as room volume, speaker-to-listener distance, audiovisual richness, and background noise increased
  • Self-perceived VOCAL EFFORT increased under the same conditions
  • Self-perceived VOCAL COMFORT decreased - the inverse pattern, consistent with an effort-comfort trade-off
  • Objective acoustic outputs (sound pressure level [SPL] and spectral moments - mean, SD, skewness, kurtosis) changed in line with the self-reports - speakers automatically adjusted their voice to the perceived room
  • Visual and audiovisual input - not just auditory cues - measurably shaped voice production. This is the first immersive-VR evidence that the visual environment is a meaningful therapeutic variable in its own right, not just a backdrop for acoustic simulation
  • Speech tasks spanned counting, sustained vowel phonation, an all-voiced CAPE-V sentence, and the first sentence of the Rainbow Passage - covering both phonation and connected speech

Background

Voice therapy traditionally happens in a clinic room - quiet, acoustically dry, with no visible audience. The voice the client produces in that room is often very different from the voice they produce in the real-world settings where their voice problem actually matters (large rooms, noisy backgrounds, social or performance audiences). Acoustic simulation alone (reverberation, background noise) partly addresses this, but immersive VR offers something acoustic simulation cannot: a synchronised VISUAL environment that the client can see, including room size, perceived listener distance, and ambient context.

Whether the visual environment actually drives measurable vocal adaptations beyond the acoustic environment had not been systematically tested in immersive VR.

What the researchers did

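Thirty-one vocally healthy adults (men and women) were tested under 18 sensory-input conditions in immersive VR. The 18 conditions were created by crossing:

  • Auditory input: two auditory rooms with different reverberation times
  • Visual input: two visual rooms with different volumes
  • Audiovisual input: combinations of the auditory and visual rooms
  • Background noise: each condition presented with and without background noise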

Each participant completed all 18 conditions, performing four speech tasks per condition: counting, sustained vowel phonation, an all-voiced CAPE-V sentence, and the first sentence of the Rainbow Passage.

Outcomes were self-perceived vocal loudness, effort, and comfort (each rated 0-100); plus objective acoustic measures - sound pressure level (SPL in dB) and spectral moments (spectral mean and SD in Hz, skewness, kurtosis).
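The two objective measures named above can be illustrated with a minimal sketch. It assumes a microphone signal already calibrated to pascals and treats the power spectrum as a probability distribution over frequency. The function names (`spl_db`, `spectral_moments`), the Hann window, and the power weighting are illustrative assumptions; this summary does not state the analysis software or settings the authors used, and some tools weight by magnitude rather than power.

```python
import numpy as np

P_REF = 20e-6  # reference pressure in pascals (20 µPa), the usual dB SPL reference in air


def spl_db(pressure_pa: np.ndarray) -> float:
    """Sound pressure level in dB re 20 µPa for a calibrated pressure signal."""
    rms = np.sqrt(np.mean(np.square(pressure_pa)))
    return float(20.0 * np.log10(rms / P_REF))


def spectral_moments(signal: np.ndarray, sample_rate: float) -> dict:
    """First four spectral moments of a signal.

    Assumption: the Hann-windowed power spectrum is normalised to sum to 1
    and used as the weighting distribution over frequency.
    """
    windowed = signal * np.hanning(len(signal))
    power = np.abs(np.fft.rfft(windowed)) ** 2
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    weights = power / power.sum()

    mean = np.sum(freqs * weights)                        # spectral mean (Hz)
    sd = np.sqrt(np.sum((freqs - mean) ** 2 * weights))   # spectral SD (Hz)
    skewness = np.sum((freqs - mean) ** 3 * weights) / sd ** 3
    # Excess kurtosis (3 subtracted so a normal distribution scores 0)
    kurtosis = np.sum((freqs - mean) ** 4 * weights) / sd ** 4 - 3.0

    return {
        "mean_hz": float(mean),
        "sd_hz": float(sd),
        "skewness": float(skewness),
        "kurtosis": float(kurtosis),
    }


if __name__ == "__main__":
    # Synthetic 1 kHz tone with 1 Pa RMS, one second at 16 kHz (illustrative only)
    sr = 16000.0
    t = np.arange(int(sr)) / sr
    tone = np.sqrt(2.0) * np.sin(2 * np.pi * 1000.0 * t)
    print(spl_db(tone))
    print(spectral_moments(tone, sr))
```

In practice these measures would be computed per task and per condition for each speaker, so that changes across the 18 conditions can be compared within subjects.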

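What they found

Self-perceived vocal loudness and effort increased, and self-perceived vocal comfort decreased, as room volume, speaker-to-listener distance, audiovisual richness, and background noise increased. SPL and the spectral moments changed in line with these self-reports. Visual and audiovisual input, not only auditory input, measurably shaped voice production.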

Why this matters

For voice clinicians considering immersive VR, this study establishes that the immersive visual context drives measurable changes in vocal output and self-perceived voice, beyond what acoustic-only simulation can achieve. Clinically, this means the choice of scenario in a VR voice-therapy session (small cafe vs. large auditorium vs. noisy classroom) is a therapeutic decision that affects the vocal adaptations to expect. The study is foundational evidence for voice-in-VR work that has since proliferated (e.g., Leyns 2025 RCT for gender-affirming voice training, Hoff 2026 voice meditation, Daşdöğen 2026 follow-on).

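Limitations

  • Participants were vocally healthy adults only; clinical voice populations were not tested
  • A single research-grade VR system was used
  • The 18-condition design optimises mechanism testing over clinical-protocol validation
  • Therapeutic efficacy in voice patients is not directly established and requires follow-on work in clinical populations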

Implications for practice

For voice clinicians considering immersive VR as a therapy tool: the immersive visual context drives measurable changes in vocal output and self-perceived voice, BEYOND what acoustic-only simulation can achieve. This is foundational evidence that immersive VR has a unique affordance for voice therapy (e.g., training projection to realistic distances, voice-in-noise habituation, ecologically valid environmental cueing for behavioral voice goals). Clinicians using Therapy withVR or similar products for voice work should treat the choice of scenario (cafe vs. auditorium vs. classroom) as a therapeutic decision, not a cosmetic one. The study is in vocally healthy adults, so clinical efficacy in voice-disordered populations still needs direct testing. The same research team (Daşdöğen and colleagues) published a 2026 Journal of Voice paper extending this work; see dasdogen-2026 in this Hub.

Cite this study

If you reference this study in your work, the canonical citation formats are:

APA 7th
Daşdöğen, Ü., Awan, S. N., Bottalico, P., Iglesias, A., Getchell, N., & Verdolini Abbott, K. (2023). The Influence of Multisensory Input on Voice Perception and Production Using Immersive Virtual Reality. Journal of Voice. https://doi.org/10.1016/j.jvoice.2023.07.026
AMA 11th
Daşdöğen Ü, Awan SN, Bottalico P, Iglesias A, Getchell N, Verdolini Abbott K. The Influence of Multisensory Input on Voice Perception and Production Using Immersive Virtual Reality. Journal of Voice. 2023. doi:10.1016/j.jvoice.2023.07.026.
BibTeX
@article{dasdogen2023,
  author = {Daşdöğen, Ü. and Awan, S. N. and Bottalico, P. and Iglesias, A. and Getchell, N. and Verdolini Abbott, K.},
  title = {The Influence of Multisensory Input on Voice Perception and Production Using Immersive Virtual Reality},
  journal = {Journal of Voice},
  year = {2023},
  doi = {10.1016/j.jvoice.2023.07.026},
  url = {https://withvr.app/evidence/studies/dasdogen-2023}
}
RIS
TY  - JOUR
AU  - Daşdöğen, Ü.
AU  - Awan, S. N.
AU  - Bottalico, P.
AU  - Iglesias, A.
AU  - Getchell, N.
AU  - Verdolini Abbott, K.
TI  - The Influence of Multisensory Input on Voice Perception and Production Using Immersive Virtual Reality
JO  - Journal of Voice
PY  - 2023
DO  - 10.1016/j.jvoice.2023.07.026
UR  - https://withvr.app/evidence/studies/dasdogen-2023
ER  - 


Funding & independence

Affiliations: New York University, Orlando, Champaign IL, Newark DE. Funding details and conflict-of-interest disclosures were not available in the abstract excerpt used for this summary. Open or paywalled status: not stated; published in Journal of Voice (Elsevier). withVR BV had no involvement in funding, study design, or authorship. This summary was prepared independently by withVR using the published peer-reviewed paper. The immersive VR system used was a research-grade custom configuration, NOT Therapy withVR or Research withVR.

Last reviewed: 2026-05-17 · Next review due: 2027-05-17 · Reviewed by: Gareth Walkom