From a video standpoint, they shot the same guy in about 10 different places. They used stop motion for the visual effects, but they just recorded the sound naturally to show the difference a room makes in terms of reverb.
Reverb is basically how music echoes in a natural setting.
Ever wondered why jazz clubs always have red or black velvet curtains on the walls? The fabric soaks up the sound, so there's almost no reverb.
If you watch the video again, check out the difference in sound between the underground cellar and the church.
Basically, reverb comes down to how sound bounces off the materials in a room (large marble church vs. small wooden wine cellar) and how big the room is. Bigger means the echo takes longer to die away, but it also sounds bigger. Smaller means a more dead sound.
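If you want to put rough numbers on that, Sabine's formula is the classic rule of thumb: reverb time (how long the sound takes to drop by 60 dB) is about 0.161 times the room volume divided by the total absorption of its surfaces. Here's a minimal sketch in Python. The room sizes and absorption values are my own guesses for illustration, not measurements of the places in the video.

```python
def box_surface_area(length, width, height):
    """Total wall, floor, and ceiling area of a box-shaped room, in m^2."""
    return 2 * (length * width + length * height + width * height)


def rt60_sabine(length, width, height, absorption):
    """Estimate reverb time in seconds with Sabine's formula (metric units).

    absorption is an average coefficient from 0 (reflects everything)
    to 1 (absorbs everything), applied to every surface for simplicity.
    """
    volume = length * width * height
    total_absorption = box_surface_area(length, width, height) * absorption
    return 0.161 * volume / total_absorption


# Hypothetical rooms: dimensions in meters, absorption values are rough guesses.
church = rt60_sabine(30, 15, 12, absorption=0.05)   # big, hard stone surfaces
cellar = rt60_sabine(6, 4, 2.5, absorption=0.15)    # small, wooden surfaces

print(f"church: {church:.1f} s, cellar: {cellar:.1f} s")
```

With these made-up numbers the church rings for several seconds and the cellar for well under one, which is the difference you hear in the video.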
In the 50s, doo-wop groups always recorded in bathrooms or echo rooms (built in studios specifically for the sound; Columbia is the best known for this).
You can reproduce it digitally, but it never sounds as good as the real thing.
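To give a feel for what "digitally" means here: one of the basic building blocks of a digital reverb is a feedback comb filter, which keeps mixing a quieter, delayed copy of the output back into itself. This is only a toy sketch in Python; the function name and the numbers are made up for illustration, and real reverb plugins combine many such filters or use a recording of an actual room.

```python
def comb_reverb(dry, delay_samples, feedback):
    """Feedback comb filter: add a quieter copy of the output from delay_samples ago."""
    wet = []
    for n, sample in enumerate(dry):
        # Before the first delay has elapsed there is nothing to echo yet.
        echo = wet[n - delay_samples] if n >= delay_samples else 0.0
        wet.append(sample + feedback * echo)
    return wet


# A single click followed by silence, so the echoes are easy to see.
click = [1.0] + [0.0] * 9
out = comb_reverb(click, delay_samples=3, feedback=0.5)
print(out)  # [1.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.25, 0.0, 0.0, 0.125]
```

Each echo comes back at half the level of the one before. A longer delay acts like a bigger room, and a higher feedback value acts like harder walls.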
Here's another example: https://www.newyorker.com/magazine/2017/07/24/a-water-tank-turned-music-venue
The coolest video you'll ever see about...reverb.