――Please tell us how you started working on 3D audio with headphones, Mr. Tobizawa.
Tobizawa: In 2016, the year people were calling "year one of VR," video had evolved steadily to the point where you could look around a full 360 degrees, and I thought, "What about sound?"
There was surround, and hi-res audio appeared after 2000, but only so-called audiophiles could enjoy the benefits. Can you really call that evolution? I was shocked that sound had barely evolved in the half century since stereo appeared. Compression technologies like MP3 came along, but from the listener's point of view that isn't evolution, it's regression.
――It's true that even on YouTube alone, video files have grown enormous while the audio is still compressed, so the gap keeps widening. Even so, it's hard to imagine how sound could evolve.
Tobizawa: Since video was going 3D under the banner of VR, I thought sound should become 3D as well. When I actually looked into how to make sound three-dimensional, it was an era when various 3D panners were coming out, and I naively assumed they would make 3D possible. The product descriptions said you could express three-dimensional space with them, so I figured I just had to adopt them.
At the time I was experimenting at my own studio in Ichigaya, Tokyo, but I couldn't get anything as three-dimensional as I expected. I had been thinking about relocating anyway, so I took the plunge and moved the whole operation to Shibuya, establishing a new studio dedicated to VR and 3D audio, believing that if I prepared the right environment, the sound would turn out as expected.
However, no matter what equipment or software I tried, I couldn't express it at all. Listening on headphones, panning a sound behind me was completely unconvincing, and moving it up or down was even worse.
――Several such software products are on the market. Do you mean their effect simply wasn't good enough?
Tobizawa: In short, the binaural processing performance was low; I felt it was far from a level where the localization could be recognized reliably. That was 2017, and for a while I was ready to give up. But then I thought, "No, that can't be right."
Creating a sense of space had always been my strength in LR stereo mixes: this sound is far in front, this one a step back, that one two steps back. Having made mixes where you can feel a natural space even on headphones, I sensed I should be able to do it. So even if no single piece of software could express it, I began to think that by combining existing plug-ins I could somehow express the rear.
――Was that around the time we asked you for the VR mix of "Sweet My Heart feat. Kanako Kotera" at DTM Station Creative? You seemed to be using Dolby Atmos's headphone monitoring mechanism. (See 767)
[Photo: Kanako Kotera's mini-album "Sweet My Heart," produced by the author]
Tobizawa: That was exactly that period. There was a simple monitoring system for mixing in Dolby Atmos, so while using it I also tried Waves' Nx, but the one that felt most promising was Wave Arts' Panorama 5 (the latest version is currently Panorama 6). I can't say I was fully convinced myself, but by using these tools selectively depending on the sound source, that song expressed as much as I could at the time.
――At the time, I thought that was the limit of what the technology could do. Did things evolve from there?
Tobizawa: I went back to the starting point of how to express three-dimensional space and dug into it. I had always expressed depth in LR stereo mixes, so how could I place a sound behind the listener? One approach was to put it out of phase. It feels very unnatural, but the sound does seem to be behind you. Reversing phase is forbidden by the common sense of mixing, but I wondered whether I could put it to good use: if you apply a phase change to a source and create that uncomfortable feeling, you might be able to localize it behind the listener.
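As a rough illustration of the out-of-phase trick described here (a minimal numpy sketch of polarity inversion in general, not Tobizawa's actual processing chain), flipping the polarity of one channel of a stereo pair puts L and R fully out of phase, producing the diffuse, uncomfortable impression he mentions:

```python
import numpy as np

def flip_polarity(stereo: np.ndarray, mix: float = 1.0) -> np.ndarray:
    """Blend the right channel toward its polarity-inverted copy.

    stereo : (n_samples, 2) float array
    mix    : 0.0 leaves the signal untouched; 1.0 fully inverts
             the right channel, putting L and R out of phase.
    """
    out = stereo.astype(float).copy()
    out[:, 1] = (1.0 - mix) * out[:, 1] + mix * (-out[:, 1])
    return out

# Quick check: a mono sine copied to both channels, then inverted.
sr = 48000
t = np.arange(sr) / sr
sine = 0.5 * np.sin(2 * np.pi * 440 * t)
out_of_phase = flip_polarity(np.stack([sine, sine], axis=1))
# L + R of out_of_phase now sums to silence, the classic
# "forbidden" out-of-phase condition described above.
```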
――It's true that putting a sound out of phase creates a very strange, unpleasant sensation.
Tobizawa: Rather than simply using the opposite phase, I experimented endlessly with short delays instead: ultra-short delays of about 1 msec to 15 msec. Then, somehow, a possibility came into view: by using these short delays as first reflections, something could be done. In the end we arrived at a method that expresses space by controlling phase with separate short delays attached in eight directions in total: front, back, left, and right, in both an upper and a lower layer.
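To make the "ultra-short delay as first reflection" idea concrete, here is a minimal sketch of a single delay tap in the stated 1-15 ms range (my own illustration, not the studio's actual implementation):

```python
import numpy as np

def add_first_reflection(mono: np.ndarray, sr: int,
                         delay_ms: float = 8.0,
                         gain: float = 0.5) -> np.ndarray:
    """Mix one ultra-short delay back into the direct signal.

    Delays of roughly 1-15 ms are too short to hear as echoes;
    the ear fuses them with the direct sound and reads them as
    an early (first) reflection, i.e. as a spatial cue.
    """
    assert 1.0 <= delay_ms <= 15.0, "stay in the ultra-short range"
    n = int(round(sr * delay_ms / 1000.0))
    delayed = np.concatenate([np.zeros(n), mono])[: len(mono)]
    return mono + gain * delayed
```

In the method described in the interview, this single tap is multiplied out into one delay per direction rather than used alone.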
――With eight directions, does that mean running a total of eight delays?
Tobizawa: That's right. Actually, each direction is further subdivided at 90 degrees, so 16 delays are used. Humans perceive these delays as first reflections and come to hear the sound three-dimensionally. The 3D panner determines how long each of the 16 delays should be set: the source is Ambisonics-encoded, and the delay time and send amount for each direction are calculated and controlled from that spatial panning information.
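The interview doesn't spell out the exact mapping, so the following is only a toy sketch of the idea as described: place reflection directions at front/back/left/right in an upper and a lower layer, then derive a send gain and a 1-15 ms delay time for each from the panned source direction. The +/-45 degree elevations and the cosine-based weighting are my assumptions, standing in for the values the article says are actually computed from the Ambisonics-encoded pan:

```python
import numpy as np

# Eight reflection directions: front, left, back, right, each in an
# upper and a lower layer (the +/-45 degree elevation is assumed).
AZIMUTHS = np.radians([0.0, 90.0, 180.0, 270.0])
ELEVATIONS = np.radians([45.0, -45.0])

def direction_vectors() -> np.ndarray:
    """Unit vectors for the 8 reflection directions, shape (8, 3)."""
    dirs = []
    for el in ELEVATIONS:
        for az in AZIMUTHS:
            dirs.append([np.cos(el) * np.cos(az),
                         np.cos(el) * np.sin(az),
                         np.sin(el)])
    return np.array(dirs)

def delay_sends(pan_az_deg: float, pan_el_deg: float,
                base_ms: float = 1.0, span_ms: float = 14.0):
    """Derive per-direction send gains and delay times from a pan.

    Stand-in rule: reflections near the source direction get more
    level, and far-side reflections get longer delays (1-15 ms).
    """
    az, el = np.radians(pan_az_deg), np.radians(pan_el_deg)
    src = np.array([np.cos(el) * np.cos(az),
                    np.cos(el) * np.sin(az),
                    np.sin(el)])
    cos_sim = direction_vectors() @ src          # in [-1, 1]
    sends = np.clip(cos_sim, 0.0, None)          # no negative sends
    delays = base_ms + span_ms * (1.0 - cos_sim) / 2.0
    return sends, delays

# Example: a source panned behind and slightly above the listener.
sends, delays = delay_sends(pan_az_deg=180.0, pan_el_deg=20.0)
```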