ZTuber Musical (3D in 2D)
While I was slicing up a famous painting to sing inside it, I was also rehearsing a new musical that wanted a 3D avatar to perform inside Zoom. Instead of the Zoom audience entering virtual reality, they wanted virtual reality to come to them. And sing. I call this experiment ZTubing or ZoomTubing.
Even a cursory dive into XR technologies demands recognition of the incredible work being done by VTubers. These creators livestream using augmented reality filters/software (like FaceRig) that turn them into a virtual character (often anime-styled), tracking their facial movements in real time as they vlog on 2D video platforms like YouTube, Twitch, etc. As with any technology, there are varying degrees of success and image fidelity, with creators like CodeMiko at the highest level, showcasing $12K+ custom kits that track everything (and I mean everything, including their real-world phone and computer) to stream entire virtual representations of their lives.
These workflows are just starting to become popular in the US (you may have used Apple’s Animoji to turn yourself into a cute animal via text message) and will likely become extremely popular as we all spend every part of our day on video chat (who needs to shower when you can be an animated character?). Whenever I present for colleges and high schools, I see students using SnapCam, and Zoom recently released its own basic AR filters.
A few months ago, I responded to an announcement from BigScreen that included a green screen “stage” inside their VR app, allowing their users to put avatars “anywhere.” I noted that I’d actually been organically doing a DIY version of this for a while using Mozilla Hubs and OBS to give clients tours of #FutureStages without them needing to log on.
I simply made a new “room” for free using Mozilla Hubs. I searched Sketchfab for a “green screen” and moved my desktop avatar in front of the green screen model so the green color filled the frame. Then I opened OBS, created a Window Capture source of my web browser, and chroma-keyed out the green. Voila! Now I could put on my VR headset, stand between the desktop avatar (the camera) and the green screen, and my audience would only see my avatar performing.
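For anyone curious what that chroma key is actually doing, here is a minimal, hypothetical sketch in Python/NumPy. OBS’s built-in Chroma Key filter handles all of this for you (far more robustly); the helper name, threshold value, and placeholder frames below are illustrative stand-ins, not part of the actual setup.

```python
import numpy as np

def chroma_key(frame: np.ndarray, background: np.ndarray,
               green_dominance: float = 1.3) -> np.ndarray:
    """Rough chroma key: wherever green clearly dominates red and blue,
    treat the pixel as green screen and swap in the background pixel.
    The 1.3 threshold is a made-up starting point, not a tuned value."""
    r = frame[..., 0].astype(float)
    g = frame[..., 1].astype(float)
    b = frame[..., 2].astype(float)
    mask = (g > green_dominance * r) & (g > green_dominance * b)
    out = frame.copy()
    out[mask] = background[mask]
    return out

# Usage sketch: key the captured Hubs browser frame over whatever should show through.
hubs_frame = np.zeros((720, 1280, 3), dtype=np.uint8)  # stand-in for the Window Capture
backdrop = np.zeros_like(hubs_frame)                    # e.g. whatever sits behind the avatar
composited = chroma_key(hubs_frame, backdrop)
```

In OBS terms, that mask roughly corresponds to the filter’s similarity/smoothness settings, and the “background” is simply whatever sources sit below the capture in the scene.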
This process had become a regular part of my spiel for schools and companies, including Theater Resources Unlimited, an educational organization for independent producers in New York City. After my presentation to TRU, I was contacted by Jesica Garrou, who was directing a new virtual musical that featured an AI character. We live-streamed a creative brainstorming session where we discussed different ways for an actor to perform the AI character. Ultimately, Jesica enjoyed the meta nature of the artificial character, in its own separate virtual world, “joining” the Zoom call.
So I performed inside Mozilla Hubs and used OBS’s Virtual Camera as my Zoom camera to play off the other actors, bringing the 3D into the 2D. (I guess when you think about it, we are all 3D characters performing in 2D over Zoom, but you get the point.)
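To illustrate that hand-off (and only as an illustration, since the show itself just used OBS’s built-in Virtual Camera), here is how the same “fake webcam” idea looks in Python with the third-party pyvirtualcam package. get_hubs_frame() is a placeholder for whatever produces the keyed frames, not a real API.

```python
import numpy as np
import pyvirtualcam

def get_hubs_frame() -> np.ndarray:
    """Placeholder: return the current keyed/composited frame as an RGB array.
    In the real workflow, OBS produces this image from the browser capture."""
    return np.zeros((720, 1280, 3), dtype=np.uint8)

# Create a virtual webcam that Zoom can select like any other camera.
with pyvirtualcam.Camera(width=1280, height=720, fps=30) as cam:
    print(f"Streaming the avatar to virtual camera: {cam.device}")
    while True:
        frame = get_hubs_frame()      # the 3D world, flattened to pixels
        cam.send(frame)               # Zoom just sees another webcam feed
        cam.sleep_until_next_frame()  # pace the loop to the camera's fps
```

The point is simply that once the avatar’s world is reduced to a stream of frames, any 2D video platform will happily treat it as a webcam.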
We found a few tricks to optimize my performance (and experience) during rehearsals. First, so much credit goes to Jesica’s partner Henry Garrou, who created a stunning avatar for my character by customizing the Blender file for Mozilla’s default avatar. We actually used a LiDAR scan of my head for him to build the robot! He also created a backdrop for the character’s digital world.
Technologist Daniel Abrahamson and Musical Director Ben Doyle McCormac had me use a professional microphone as my audio input (instead of the one built into my VR headset) for a better vocal performance. I actually kept my headset mic unmuted so the avatar’s mouth would flap when I talked, selling the liveness. This did, however, create constant feedback/echo on my end that gave me a bit of a headache and messed with my timing, but I got used to it over time and actually learned to “jump” the lines of my fellow performers to avoid long pauses, knowing I was a few milliseconds “behind” everyone.
Finally, I pulled out my old baseball hats to prop up my headset using the brim. This let me angle the avatar’s head to look its best on camera, and I could monitor my feed and actually see my fellow actors, which helped me connect and respond to them more authentically. It felt a lot like traditional motion capture: balancing a large, expensive thing on my head to track my performance while desperately trying to act naturally.
With fewer blend shapes on the avatar’s face, I found much more of the character’s expression in the hands, voice, and posture. It felt like I was really chewing the scenery to give the character even simple, nuanced movements that didn’t feel too static. Where are my summer stock performers at? If you can fill an outdoor amphitheater, avatar acting is for you!
Will ZTubing become a thing? Probably not. But I’m genuinely excited for the endless possibilities and applications of bringing avatars, augmented filters and body tracking to live performance.