Engineering + user-reception study (Computers & Graphics 2025) of a speech-controlled VR system for voice and public-speaking training: extracts pitch / timbre / speech rate from 529 utterances by 15 students for real-time virtual character response
How this was rated
Engineering / user-reception study with a 15-participant speech corpus and 6 expert annotators. Peer-reviewed in Computers & Graphics (Elsevier, special section on XRIOS 2024). The paper's contribution is system design and user-reception evaluation, not clinical efficacy. Limitations: not a clinical trial; speech corpus is small for generalizability to clinical populations; voice parameters extracted are technical-engineering features (pitch, timbre, rate) rather than clinically validated voice-handicap measures.
Ratings use a simplified four-tier scheme (High, Moderate, Low, Very Low) informed by the GRADE working group.
An engineering and user-reception study published in the Computers & Graphics special section on XRIOS 2024. It is a Polish-British collaboration (AGH Krakow, SWPS Warsaw, Polish Academy of Sciences, Kielce University of Technology, University of Cambridge). The system is built on a corpus of 529 utterances recorded during presentations by 15 students. The voice parameters extracted are pitch, timbre and speech rate. Six expert annotators evaluated the stress level of each presentation. The multi-parameter analysis selects features for real-time animation of virtual characters that respond dynamically to changes in speech. The contribution is design and user-reception evaluation rather than clinical efficacy.
An engineering / user-reception study of a speech-controlled VR system for voice and public-speaking training. The contribution is design methodology (speech corpus, parameter extraction, real-time animation control) rather than clinical evidence. For voice clinicians and researchers, this paper illustrates an emerging affordance in VR: virtual characters that respond dynamically to the speaker's voice parameters in real time. It is not appropriate as a clinical efficacy citation; it is useful as a methodology and design reference for next-generation VR voice-training systems.
Key findings
- Engineering + user-reception study published in the Computers & Graphics special section on XRIOS 2024
- Speech-controlled VR system: virtual characters respond dynamically to the speaker's voice parameters (pitch, timbre, speech rate) in real time
- Speech recordings corpus: 529 utterances given during presentations by 15 students
- Voice parameters extracted using speech processing methods: pitch, timbre, speech rate - then mapped to real-time animation control of virtual characters
- Six expert annotators evaluated the stress level present in each presentation - human-rated stress labels that complement the acoustic features and support stress-modulated character response
- Polish-British international collaboration: AGH University of Science and Technology (Krakow), SWPS University (Warsaw), Polish Academy of Sciences (Krakow), Kielce University of Technology, University of Cambridge
- Emerging VR affordance illustrated: virtual characters that respond dynamically to speaker behavior - moving beyond pre-recorded audience animations toward truly responsive social-VR systems
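This summary does not state which speech processing methods the authors used to extract pitch. Purely as an illustration, the sketch below shows one common textbook approach, autocorrelation over a short audio frame. The function names, the 75-400 Hz search range and the 16 kHz sample rate are assumptions made for this example, not details taken from the paper.

```python
import math


def estimate_pitch_hz(frame, sample_rate, fmin=75.0, fmax=400.0):
    """Estimate the fundamental frequency of one frame by autocorrelation.

    Generic textbook method. fmin and fmax are assumed bounds for speech,
    not values reported by the study.
    """
    if len(frame) < 2:
        return None
    # Remove the DC offset so a constant bias does not dominate the score.
    mean = sum(frame) / len(frame)
    x = [s - mean for s in frame]
    # Candidate periods (in samples) that correspond to fmax..fmin.
    min_lag = max(1, int(sample_rate / fmax))
    max_lag = min(int(sample_rate / fmin), len(x) - 1)
    best_lag, best_score = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Similarity between the frame and itself shifted by `lag` samples.
        score = sum(x[i] * x[i + lag] for i in range(len(x) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    # No positive peak (for example, silence) means no pitch estimate.
    return sample_rate / best_lag if best_lag else None


def synth_tone(freq_hz, sample_rate=16000, n=1024):
    """Hypothetical helper: a pure sine tone used as stand-in test audio."""
    return [math.sin(2 * math.pi * freq_hz * i / sample_rate) for i in range(n)]


if __name__ == "__main__":
    print(estimate_pitch_hz(synth_tone(200.0), 16000))
```

On real speech, a production system would typically add voicing detection and smoothing across frames; those steps are left out here to keep the sketch short.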
Background
Most VR voice and public-speaking training systems use pre-recorded audience animations - the virtual audience does not respond to what the speaker actually says or how they say it. Real-time, speech-controlled virtual characters that respond to the speaker’s voice parameters and stress levels are a next-generation design direction. By 2024-2025, the engineering pipeline for such systems was maturing.
What they did and found
A speech-controlled VR system was built on a corpus of 529 presentation utterances from 15 students. Voice parameters (pitch, timbre, speech rate) were extracted using speech processing methods. Six expert annotators evaluated the stress level of each presentation. The multi-parameter analysis selected features for real-time animation control of virtual characters that respond dynamically to changes in speech. A user-reception evaluation followed.
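The summary does not describe how the selected features drive the characters. The following is a minimal, hypothetical sketch of the general idea: voice features are compared with a speaker baseline and turned into a smoothed 0-1 control value that an animation layer could read. The class names, the "attention" parameter, the deviation formula and the smoothing factor are all assumptions for illustration, not the authors' mapping; timbre is omitted for brevity.

```python
from dataclasses import dataclass


@dataclass
class VoiceFeatures:
    pitch_hz: float
    speech_rate: float  # assumed unit: syllables per second


def clamp01(value):
    """Keep a control value inside the 0..1 range."""
    return max(0.0, min(1.0, value))


class AudienceAttention:
    """Hypothetical controller: maps voice features to a 0..1 attention level."""

    def __init__(self, baseline, smoothing=0.5, initial=0.5):
        self.baseline = baseline    # the speaker's calm reference values
        self.smoothing = smoothing  # 0..1, higher reacts faster
        self.level = initial        # current attention level

    def update(self, features):
        # Relative deviation of each feature from the speaker's baseline.
        pitch_dev = abs(features.pitch_hz - self.baseline.pitch_hz) / self.baseline.pitch_hz
        rate_dev = abs(features.speech_rate - self.baseline.speech_rate) / self.baseline.speech_rate
        # Assumed rule: larger deviations lower the audience's attention.
        target = clamp01(1.0 - (pitch_dev + rate_dev))
        # Exponential smoothing so the characters do not jitter frame to frame.
        self.level += self.smoothing * (target - self.level)
        return self.level
```

Calling `update` once per analysis frame would yield a slowly varying value; how that value maps onto gaze, posture or gesture animations is a separate design decision that this sketch does not cover.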
Why it matters
For voice clinicians and SLP researchers, this paper illustrates the engineering trajectory toward responsive virtual characters in VR voice-training contexts. It serves as a methodology and design reference for next-generation clinical VR systems.
Limitations
Not a clinical trial. Small speech corpus. Engineering-feature voice parameters rather than clinically validated voice-handicap measures.
Implications for practice
For voice clinicians and SLP researchers, this paper illustrates the engineering trajectory toward VR systems with virtual characters that respond dynamically to the speaker's voice and stress parameters. This is a meaningful design direction for next-generation VR voice and public-speaking training systems - moving beyond static or pre-recorded virtual audiences toward responsive social-VR contexts. It is not appropriate as clinical efficacy evidence; use it as a methodology reference for clinical-engineering collaboration. For Therapy withVR product design, the speech-parameter-to-character-animation pipeline is a relevant emerging affordance.
Cite this study
If you reference this study in your work, the canonical citation formats are:
@article{bartyzel2025,
author = {Bartyzel, P. and Igras-Cybulska, M. and Hekiert, D. and Majdak, M. and Łukawski, G. and Bohné, T. and Tadeja, S.},
title = {Exploring user reception of speech-controlled virtual reality environment for voice and public speaking training},
journal = {Computers \& Graphics},
year = {2025},
doi = {10.1016/j.cag.2024.104104},
url = {https://withvr.app/evidence/studies/bartyzel-2025}
}

TY  - JOUR
AU  - Bartyzel, P.
AU  - Igras-Cybulska, M.
AU  - Hekiert, D.
AU  - Majdak, M.
AU  - Łukawski, G.
AU  - Bohné, T.
AU  - Tadeja, S.
TI  - Exploring user reception of speech-controlled virtual reality environment for voice and public speaking training
JO  - Computers & Graphics
PY  - 2025
DO  - 10.1016/j.cag.2024.104104
UR  - https://withvr.app/evidence/studies/bartyzel-2025
ER  - 

Know of research that should be in this hub? If a relevant peer-reviewed study is not listed here, send the reference to hello@withvr.app. The hub is kept up to date as the literature grows.
Funding & independence
Affiliations: AGH University Krakow, SWPS University Warsaw, Polish Academy of Sciences, Kielce University of Technology, University of Cambridge. Funding sources are reported in the published article. Peer-reviewed in Computers & Graphics (Elsevier). No withVR BV involvement.