Panoptic reconstruction of immersive virtual soundscapes using human-scale panoramic imagery with visual recognition
Huang, Mincong (Jerry)
MetadataShow full item record
This work, situated at Rensselaer's Collaborative-Research Augmented Immersive Virtual Environment Laboratory (CRAIVELab), uses panoramic image datasets for spatial audio display. A system is developed for the room-centered immersive virtual reality facility to analyze panoramic images on a segment-by-segment basis, using pre-trained neural network models for semantic segmentation and object detection, thereby generating audio objects with respective spatial locations. These audio objects are then mapped with a series of synthetic and recorded audio datasets and populated within a spatial audio environment as virtual sound sources. The resulting audiovisual outcomes are then displayed using the facility's human-scale panoramic display, as well as the 128-channel loudspeaker array for wave field synthesis (WFS). Performance evaluation indicates effectiveness for real-time enhancements, with potentials for large-scale expansion and rapid deployment in dynamic immersive virtual environments.