I had the pleasure of collaborating again with Nikk Mitchell, the fantastic CEO of FXG who is loved by the whole VR community. After the review of the Huawei VR glasses we did together, this time I interviewed him and FXG’s CTO Wilson Li about the incredible work they’re doing with VR videos in China. FXG has released a professional 180 VR camera able to record 12K (6K per eye) videos and is collaborating with Pico in the realization of VR concerts.
This is probably one of the most important topics of this interview: Pico is investing literally millions of dollars to create concerts with famous Chinese singers that are only meant to be streamed in VR. They can be enjoyed via Pico Video, with users able to share short videos of them on TikTok, creating a big buzz on social media. Pico has recently announced that it is going to do something similar in the West, so what Nikk is doing is a bit like a time machine showing the future of what we’ll see here soon. In the remainder of the article, you can enjoy the interview, along with some exclusive backstage footage of the concerts and footage of the concerts as enjoyed inside Pico Video (China).
You can appreciate this all in the video embedded below, or by reading the (slightly edited) transcription that appears after it.
Ah, before we start, since Nikk is very generous, we organized for you a giveaway of FXG T-shirts, and you can participate using the form here below!
FXG Interview – Video Version
FXG Interview – Textual Version
Hello Nikk, hello Wilson. Thanks for connecting with me from distant China. Let’s start with the basics. Nikk, can you introduce FXG and your team?
Nikk: FXG is a VR film tech company based out of Hangzhou, China. Hangzhou is an hour away from Shanghai, the home of Alibaba and a bunch of other tech giants that people [in the West] maybe never heard of. It’s a great city and we make VR film technology. Our most important [hardware] products are VR cameras. Released internationally now is the FM DUO: it’s a 12K full-frame VR camera, and now in China, it’s become the standard for doing super high-resolution 8K live streams. That’s what we’ve been up to a lot, working with ByteDance and the other big tech giants in China and soon around the world, helping them to create amazing high-quality VR live streams.
Let’s start with one of these amazing projects. You are collaborating with Pico, which is now aiming at the consumer VR market both in China and the West. You did some concerts with some Chinese artists, if I remember well. Can you tell us something about this?
Nikk: These projects have been really meaningful for me on a deep emotional level. I’ve been making VR films for half a decade and I filmed so many concerts. I remember first trying to go film a concert and people wouldn’t even let me film. Then, finally, people heard about VR. I found people willing to give it a try and they would let me film. But it’d be like, “Don’t get in the way of anybody.” They put me in the back beside all the other long telephoto lens cameras and I would get the worst footage.
Then, finally, they’d be like, “All right, you can go near the stage, but don’t get in the way of any of the other camera shots.” With ByteDance or with Pico, now it’s different.
For everybody who doesn’t know, ByteDance is the company that owns TikTok and now they also own Pico which is the leading Chinese VR headset. When I say Pico or ByteDance, they’re the same company. It’s like Oculus or Meta currently.
The first concert I did with ByteDance was for the singer Wang Xi, also known as Elvis Wang, who is an up-and-coming singer. That was a really special concert because when we filmed, they had like 1,000 people in the audience, but right in the center. Wang, he’s here, and like a meter in front of him, they put one of our FM DUO cameras. Then 5 meters behind that, at a high angle, they had another camera.
I was amazed because this was right in the view of everybody. Do you know what I mean? Before, people were telling me to stay out of the way, while here they wanted me to be in the perfect shot. They gave us the ideal space for everybody to see us, which was so amazing. The Wang Xi concert was a really huge thing because they put us right in the middle and we had two camera angles, which in VR, people would switch back and forth from the long distance to the close distance… and they let us have priority above the audience.
Then we went and filmed Michael Zheng (Chinese name ZhengJun). He’s the Bob Dylan of China. Everybody of Wilson’s age is in love with him. I mean, I don’t know if Wilson is really in love with him, but he grew up listening to him.
When we filmed that concert, there was no audience. It was a huge production, in dollars, it would be in the millions of dollars, including the stage, which was built just for this one concert. The team, including our FXG, had maybe 20 people on set. ByteDance had a load of people. They had a production company. They had all these people. A huge production and there was no audience. There was an audience, but there was no meat, body physical audience who were there.
To me, that was an unbelievable moment where actually VR film is so important that they would spend millions of dollars and bring a huge star for a concert that would only go to the metaverse, to people watching it through virtual reality. That’s now what we’ve been doing this year, working with ByteDance on a huge project after a huge project, and it’s all about VR. They’re putting all of this money into hiring these amazing artists, and it’s all about creating epic VR experiences and they’re really epic.
Wow. It’s really incredible what you’re saying, but I want to go a bit deeper and ask Wilson and you: how was the pipeline to create this concert? I don’t think you just put two cameras and then everything was done. How did it work on your side and also on the users’ side?
Nikk: I’ll start on a shallow level. The setup was, for the Michael Zheng one, it was eight cameras. It was four different positions. We’re in the metaverse space. They have the metaverse virtual audience space. If you go [in the virtual world] over there then it switches to that camera. If you go over there, it switches over there. When you go on stage, it switches to the onstage camera.
We had all of these different cameras set up. We went there and we hooked up the cameras. Before, we had to make sure they had high-speed internet. Then, Wilson, where does it go from that? From the cameras set up then what’s the next stage? What’s the pipeline?
Wilson: We first use our camera to capture the 12K video, that is, the original fisheye-lens video, and we then have two ways to deal with these videos. We can stitch them in the camera or use a more advanced algorithm to produce better image quality. We use a PC GPU to perform the super sampling from 12K to 8K. It gets better quality than native 8K.
Then we do some image optimization for better color, better exposure, and so on. Then we stitch them. Finally, we stream the data to the back end of the ByteDance decode-encode engine, where they re-encode the video and add some effects. Then they broadcast it through a CDN to the users. The users have their Pico Video app, which can play back the video and also generate some of what they call 6DOF effects. The users also have some social functions and can enjoy the concert with other people. That’s the pipeline.
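To give a rough idea of what super sampling from 12K to 8K means, here is a minimal sketch in Python. It is not FXG's algorithm: the real pipeline runs on a GPU with a filter that is not described in the interview. The function name `area_downsample_row` is hypothetical, and plain area averaging is assumed as a stand-in. The only detail taken from the interview is the 3:2 ratio, so every output pixel blends 1.5 captured pixels.

```python
def area_downsample_row(row, out_len):
    """Resample one row of pixel values to out_len samples by area averaging.

    Illustrative stand-in for the 12K -> 8K step (a 3:2 ratio); the filter
    actually used in production is not public.
    """
    in_len = len(row)
    scale = in_len / out_len  # input pixels covered by each output pixel
    out = []
    for i in range(out_len):
        start, end = i * scale, (i + 1) * scale
        total = 0.0
        j = int(start)
        while j < end and j < in_len:
            # Overlap between input pixel [j, j + 1) and the span [start, end)
            overlap = min(end, j + 1) - max(start, j)
            total += row[j] * overlap
            j += 1
        out.append(total / scale)
    return out


# Six captured samples reduced to four, the same 3:2 ratio as 12K -> 8K.
captured = [0, 30, 60, 90, 120, 150]
delivered = area_downsample_row(captured, 4)
print(delivered)
```

Because each delivered pixel is an average of more than one captured pixel, sensor noise is smoothed out, which is one plausible reason a downsampled 12K image can look cleaner than a native 8K one.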
Nikk: Tony, I can explain a little bit more about what Wilson was talking about. The first step was about us sending the stream from our camera to them. It’s going from 12K and then downsampling to 8K and stitching it live. What’s the latency from our camera to their back end?
Wilson: We’re using some protocol that will provide less than, we believe, less than 2 seconds. It’s RTMP push streaming and they will do some effects and some encoding and then the total latency will be about 20 seconds or 30 seconds for the end user.
Nikk: From our system, we go with just 2-second latency to their back end. We call our stream “Super 8K” because the final video feed is in 8K, but because it’s filmed in 12K and then downsampled, it’s going to be a higher resolution than the [native] 8K. Then, we go from that to their app, which is really fascinating. As far as I know, nothing exists like it in the West as far as metaverse theater viewing.
[Notice: After this interview has been recorded, Pico has actually announced the release of the Pico Video app also in the West]
It’s really like what I mentioned with the… Hey Wilson, how do they call it? Lite 6DOF? Virtual 6DOF? What was the word they used?
Wilson: [chuckles]
Nikk: I’ll call it 6DOF-ish, 6DOF Lite. It’s still a video, but none of the video feeds are actually in 6DOF. It’s a 180 high-resolution video feed but with your movement in the space, you switch back and forth between the videos.
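The "6DOF Lite" idea Nikk describes, where moving in the virtual hall switches you between fixed 180 video feeds, can be sketched in a few lines of Python. This is only an illustration under assumptions: the zone names, the coordinates, and the nearest-point rule are all hypothetical, and the interview does not say how Pico Video actually decides when to switch.

```python
import math
from dataclasses import dataclass


@dataclass
class CameraZone:
    name: str   # hypothetical label for a camera position
    x: float    # hypothetical position in the virtual hall, in meters
    y: float


def pick_feed(zones, viewer_x, viewer_y):
    """Return the zone whose anchor point is closest to the viewer's avatar."""
    return min(zones, key=lambda z: math.hypot(z.x - viewer_x, z.y - viewer_y))


# Four positions, as in the Michael Zheng concert; the layout is invented.
ZONES = [
    CameraZone("stage", 0.0, 0.0),
    CameraZone("left", -5.0, 4.0),
    CameraZone("right", 5.0, 4.0),
    CameraZone("back", 0.0, 10.0),
]

# An avatar walking toward the left side of the hall gets the left feed.
print(pick_feed(ZONES, -4.0, 3.0).name)
```

Each feed stays a flat 180 video; the impression of freedom comes only from choosing which feed to show.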
Also, what ByteDance is doing that I think is really fun is combining the effects of the concert stage with VR. The concert stage as, I guess, you can see in these videos that have been cut in here is this crazy stage with all these LCD screens and everything set up there and the colors change and video plays on the back. Then depending on the colors of the actual video and the actual stage, the virtual space also changes. The virtual metaverse concert hall changes.
In one song, it’s like white and blue, like a sky and paper airplane animation on the back LCD screens. Then while that video’s playing, they have 3D animated paper airplanes that fly up. You can actually see they do so much to actually connect their virtual concert space where you have your avatars and your people with the actual video feed. You really can actually get that feeling they’re connected, they’re one. It’s not this isn’t a virtual theater where I’m seeing the concert happening there, it’s like I’m actually connected to it.
That’s cool, but were you streaming live this concert or you recorded it and then you performed it later?
Nikk: Live. Wilson, you said from the concert to the headset is 10 to 20 seconds?
Wilson: Yes. About 20 or 30 seconds for the total latency.
Nikk: That’s completely live and actually now you can send in comments and emojis and different things on the Pico Video platform. Now they’re hooking it up. Actually, there are interactions with the performer. It’s like in Altspace… but Altspace is a person who’s in the virtual space doing a virtual concert, and they can see the emojis from everybody and everything… but now they’re connecting it so that the person doing the concert [in the real world] can see the comments and different things from people watching and also adding in actual interactions.
In one part there is a menu that pops up with a vote for what should be his next song. They planned in advance that everybody would get a vote and then the people in VR could select and actually influence the concert. That message gets back to the performer and then he alters his performance based upon interactions by the many in VR.
So, is there a custom app by Pico or is the concert held in the standard Pico Video experience?
Nikk: That’s the app. Pico Video.
Wilson: Pico Video.
Makes sense. Are you also using 5G, technically speaking, or is it not necessary for what you’re doing?
Nikk: It’s wired internet from the concert hall and then to the viewer there… or maybe they’re at home, so they get it via their WiFi or they’re outside and so on 5G… it can definitely run on 4G, but 5G would be the perfect connection.
Makes sense. Why did you decide to do a concert with your cameras in 180 VR and not with all CGI, or with 360 videos? Why did you choose this format?
Nikk: Why live action?
In 180 degrees, if I understood correctly.
Nikk: 180 over 360 is just where the technology’s at. In the future of VR film, definitely 360 is going to be king, but with current resolutions, let’s say an 8K 180 video is going to be twice as high resolution as an 8K 360 video because you have 8,000 pixels that are in half of your view instead of all of your view. For that, with the current methods and also VR live streaming in 360, nobody can really do that well, like live stitching and everything.
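Nikk's "twice as high resolution" point is simple arithmetic: the same number of pixels spread over half the angle. A quick sketch in Python, assuming an "8K" width of 7680 pixels and ignoring stereo packing and lens projection (both simplifications of mine, not figures from the interview):

```python
def pixels_per_degree(horizontal_pixels, field_of_view_deg):
    """Average horizontal pixel density across the field of view."""
    return horizontal_pixels / field_of_view_deg


WIDTH_8K = 7680  # assumed horizontal pixel count for an "8K" frame

ppd_180 = pixels_per_degree(WIDTH_8K, 180)
ppd_360 = pixels_per_degree(WIDTH_8K, 360)

print(f"180 video: {ppd_180:.1f} px/deg")
print(f"360 video: {ppd_360:.1f} px/deg")
print(f"ratio: {ppd_180 / ppd_360:.1f}x")
```

Under these assumptions, the 180 video packs about 42.7 pixels into every degree of view against about 21.3 for the 360 video, which is the factor of two Nikk mentions.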
Stay tuned… a little teaser, FXG is planning on releasing a 360 camera next year that we’ll be able to do all-in-one live streaming over a 5G native connection but that’s not announced yet. I’ve never told any media about that. That’s a little sneak peek for you, Tony [THANKS NIKK!]… but it’s hard. It’s hard to make a good 360 live stream and nearly impossible actually.
That’s why we choose 180 over 360 and live-action over CG because you want to see the star you love. You want to see them. Do you know what I mean? It’s pretty cool if I can see an avatar that’s controlled by Justin Bieber as he did with his metaverse concerts, but you want to see Justin Bieber. Do you know what I mean? It’s like, hey wow, this represents Justin Bieber or this is controlled by Justin Bieber, but if it was a VR video, it’s like, no, this IS Justin Bieber. I just believe that for actually experiencing a concert, it depends.
I think virtual shows and concerts will be super huge for virtual idols like Miku Miku, the Japanese V-idol. For sure it makes sense and it’s awesome because she exists as a virtual being, but for people who exist as physical beings, we want to see them physically.
What was the biggest difficulty you had in creating these concerts? I can’t even imagine the effort and the responsibility you had… what have been your biggest difficulties and how have you solved them?
Nikk: We’ve had so many difficulties. Wilson, talk about the tech. What are some of the challenges to go through here?
Wilson: From a technology standpoint, we met some difficulties. You just mentioned 5G, but in the real case, we used wired internet, because we are filming in a city far from Pico CDN center in Beijing and there are thousands of kilometers to go through. The network is not so ideal and we are using very high bandwidth. We are using hundreds of megabytes to keep the image quality high. There are eight cameras. It means there are eight channels we need to transmit at the same time. Sometimes we found the network was not so stable and we worked with ByteDance to solve that. Finally, we solved it. We have a very stable network, now.
Nikk: How did you solve the network issue?
Wilson: Actually, this is very technical. They set up a local interface in that city to provide a direct CDN link from that city to Beijing. We got very low latency after that. Before, we had a 100-millisecond delay, and after, we got a very low and very stable latency of about 3 or 4 milliseconds. After that, the live stream was very stable, but we met some other problems that had to do with the synchronization of all the channels. They needed to be kept completely in sync.
Nikk: Oh, yes, because you have all of these different video feeds that all need to be perfectly synced because when I am in the virtual space, maybe I’m on the right side of the concert and now I want to move to the left side of the stage, and so the moment I move there, it should be seamless. It’s the music. You’re watching a concert so if it’s off by any time, it’s going to be totally ruined. It all needs to be in perfect sync across everything.
Wilson: Actually, I think it’s not a secret that we communicate with ByteDance servers using a private protocol that is a modification of RTMP, where we add some timestamp information inside the streams. When the streams are processed, this information is kept and used by the headset player, which uses it to keep the streams on the same timeline.
Nikk: We built our own private protocol that contains time information. That’s amazing… I didn’t even know how cool it is!
Wilson: I think it’s ByteDance, they’ve made this protocol. We just follow their instructions, but maybe there are some other ways to solve this problem. We have some other suggestions but maybe we will try them for future events. That’s a difficulty we have.
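The synchronization idea Wilson describes, where capture timestamps travel inside each stream and the player uses them to stay on one timeline, can be illustrated with a small Python sketch. The actual protocol is a private modification of RTMP and is not reproduced here; the feed names, the timestamp values, and the functions `frame_at` and `switch_feed` are all hypothetical.

```python
from bisect import bisect_right


def frame_at(timestamps, timeline_ms):
    """Index of the latest frame captured at or before timeline_ms.

    Falls back to the first frame if nothing that old is buffered.
    """
    idx = bisect_right(timestamps, timeline_ms) - 1
    return max(idx, 0)


def switch_feed(feeds, target, timeline_ms):
    """Capture timestamp of the frame to show after switching to `target`."""
    timestamps = feeds[target]
    return timestamps[frame_at(timestamps, timeline_ms)]


# Hypothetical buffered capture timestamps (ms) for two camera feeds.
# The feeds may have arrived with different network delays, but each frame
# still carries the time at which it was captured.
feeds = {
    "stage": [0, 33, 67, 100, 133],
    "left": [1, 34, 68, 101, 134],
}

# The viewer is at 70 ms on the shared timeline and moves to the left side.
print(switch_feed(feeds, "left", 70))
```

Because the lookup uses capture time rather than arrival time, switching cameras lands on the same musical moment even when one feed took longer to cross the network.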
Nikk: From my perspective, at pretty much every step of the way we met all kinds of new tech challenges because it’s like… From everything that we’ve seen, I believe the first 8K live stream we did was the very first one that’s ever been done in China.
I’m not sure about the entire world. I don’t know of one specifically.
It may be the world’s first but definitely, in China, there’s never been an 8K 3D simultaneous live stream. There’ve just been tech demos of TV stations that have done it to show off the tech. But actually, as an event with thousands or tens of thousands of users directly into their homes, that was the first. We had to rebuild a special CDN in the location and use timestamps combined with the video feed… there was just so much from our side as well as from ByteDance’s that needed to be built new or edited from the past to be able to actually just run the event.
How many people attended these events?
Nikk: The statistics haven’t been given of how many people attended the event. But something that you can share is that during the event you have a virtual camera that you can carry and you can use that to take selfies in the event. You can use that to film the band. You can do all these things with it inside of VR, inside of the Pico headset, and then with one button, you can share it directly to your Douyin (Douyin is the Chinese version of TikTok).
You can see here, this is from Michael Zheng’s event, there are hundreds and hundreds of shares from people that took selfies and videos inside. This next one you can see is from a Chinese, kind of like K-pop, C-pop… I don’t know what they call it… but this band’s name, and again, you can see when their concerts happen and go live, there are so many posts on Douyin.
You can see that there are a lot of people, and I don’t know what percentage of people post, but there are hundreds of people posting so there are definitely a lot of people who are enjoying it and experiencing it. I guess the numbers aren’t shared yet either but from what we know, the number of headsets that are being sold is huge and a huge part of that is because of these events.
Pico totally has a concept that I personally share of VR being so much more than an immersive gaming console and that it’s actually for everybody. That’s why their investments are in these events and so they’ve been pulling in all kinds of people who aren’t traditional VR users because they have these amazing experiences live and recorded of IPs and artists and things that people love.
That’s very, very cool. Compliments for what you’ve done. I want to get back instead to how you did the concert, especially about your camera which was fundamental to shooting the performances. Can you tell me something more about your FM DUO camera?
Nikk: Wilson, tell us more about it, what’s this camera?
Wilson: We hope that many professionals will be using our camera. Regarding the technical specs, it’s a dual full frame sensor and it can capture 12K video and it can do embedded stitching and live streaming. Then we also use a feature, we call it super sampling from 12K to 8K, that gives the very best image quality for live streaming or recording. It is very good for professionals to do their production.
We have some software support for recording and live streaming and then you can do the real-time stitching for 12K to 8K and do some image optimization and stitching. We think it’s a very professional machine for those high-end users that want to do live streaming and some other recording or something.
Nikk: The dual full-frame sensors are huge because that’s a super high resolution in there. And then to be able to stitch inside the camera, the embedded stitching, there are a lot of tech parts that really make it shine, that offer such an experience that made ByteDance choose us to do their million-dollar productions. So much is spent on the stars and on the stage but they make sure every part they have gets the best quality.
When you say 12K, is it 12K for the whole sensor or for every eye?
Wilson: For the sensor. There are two full-frame sensors inside the camera, each can capture 8,000 square pixels at the same time and so we can combine the two lengths together. It’s about 12K by 6K. The resolution, I think, is very important for what we are filming, and it’s also full frame, so we get very good image quality.
What about the frame rate?
Wilson: Frame rate… currently we’re using 30 fps but actually it can record in 8K 60 frames per second and it’s very good for some sports scenes and some other faster movements.
Nikk: We record for ByteDance in these concerts and we do it in 30 frames per second because this way we can record in 12K. Even though the final view on the headset is 8K but because it’s a 12K film downsampled to 8K, it’s going to be higher quality.
In April, I tried the 180 VR camera from Canon. It was a very good product, I think. What do you think is the difference between yours and this product?
Nikk: Great question.
Wilson: Yes, first is that our camera can do live streaming and it can do embedded stitching or outside stitching using a PC, and live streaming is very important for our design. And second, for the image quality, we have done some tests, even in 8K versus 8K, and we believe we have better quality and resolution. If we use 12K sampling to 8K, we will get even better image quality. That’s it. There are some other issues, like we have to do some very long recording and live streaming sessions. For example, we did some tests lasting 72 hours and our camera could do the live streaming in a very stable way, and…
Nikk: 72 hours straight.
Wilson: Yes.
And didn’t it explode?
Nikk: No. [laughs]
Ah, that’s good then. One thing that I appreciated, for instance, when the Canon guy let me see the camera, was that it seemed to be also usable by someone like me, like a hobbyist or not a professional, anyway. Are you targeting only professionals with the FM DUO or like if I want to use it, then you just give me some instructions, and I’m able to use it?
Nikk: This is for professionals but we have plans in the future to create a model… Currently, this camera sells for almost $20,000 and we have plans for another model coming out that we want to be cheaper and we also want it to be more accessible. Our goal is, for the long term, to be able to make amazing VR cameras so everyone can live stream and record epic VR content but we’re beginning with the very top of the market focusing on ByteDances and brands like this to give them the greatest VR quality, and then we’re going to slowly step by step try to expand it to the rest of the world.
Makes sense. Can you just repeat, in case someone is interested, where people can buy the FM DUO and for what price?
Nikk: You can just visit our website, fxgvr.com. What’s the camera website? We just made a new website specifically about the camera. Check in the description or somewhere, but on fxgvr.com, you can find information, and currently, the camera sells for just under $20,000 US.
Makes sense. It’s a product for professionals and that’s the usual price range for this kind of product. You don’t do a $1 million production with a $100 smartphone. Before leaving you after this interesting chat, I would like to ask you a few questions about China, since you are there.
Lately, after COVID, communication between China and the West has become more complicated because of all the restrictions. I want to ask you two things. The first one is how different, in your opinion, doing these virtual events in China is compared to doing them in the West. We’re seeing, for instance, that Meta is doing some events with streaming of artists. You are doing the same, but in China with Chinese artists. What differences do you see in this?
Nikk: I think one thing that you mentioned… because of COVID, I think anywhere in the world, that’s made the concept of virtual events to be a lot more meaningful, so people are really interested in that and the concept makes sense. It’s like, we don’t know when we will be able to or whatnot. Though in China now, most COVID restrictions are low so people can still go to events but I guess what I would see as the difference is that it’s two brands, basically. The West is Meta, China is ByteDance, and so Meta, Oculus, ByteDance, Pico.
The biggest difference I see is the focus on film. When you see the investment that Meta has done in sponsoring content, sponsoring developers, and things like that, you can see it’s primarily in games and applications. I’m sure they still spend good money doing concerts and things, but the ratio of it is in favor of gaming. Pico, definitely, supports games on its platform, but it’s very clear to them that their belief in the growth of the platform is that video is like the killer app. Whereas, I think, to Meta, video is one thing and I don’t know what their focus is on it, but it’s very clear with ByteDance, that they have a belief that that can take VR to the masses, specifically VR video and VR live streams.
The second question is: what about the Chinese metaverse? Because there is a lot of chat in the West where people have a very confused idea of what is happening. A lot of bullshit is being said, just to be very honest. How do you see the development of the metaverse in China in these years?
Nikk: It’s way crazier than even in the West. They have metaverse conferences. They have metaverse incubators. They have metaverse in government-wide initiatives from like a provincial level of some places but it’s definitely bigger than the West, but I don’t think it’s any more sensical than the West. Like it’s still, everybody’s talking about it, but I think conceptually, it’s so broad and confused. People are all talking about it and they all think that it’s a thing that’s important that they should care about, but do they even understand what it is… is a total question.
I think it’s very similar to the West, where it’s so much hype with a little bit of actual serious projects on it, though around all the smoke and mirrors and everything of just nonsensical hype, I personally believe that this live streaming connected with social VR, I believe that that will be a great metaverse use case.
I know the metaverse is an annoying word, but I could say it works for the social VR use case because it’s all about having these social VR experiences, whether it’s VRChat or Altspace or Horizon or in China, the Oasis or vHome or Pico Video… you have these awesome platforms where you can connect with other people, but it also comes down to what are you going to do in these platforms and video is such a core part of content just in general. We’ve seen if we look traditionally at the internet, video has been a key part of the internet. We look at mobile development, and video has been a key part of it.
We see the metaverse is web 3.0, and that that’s going to be such an important thing but I think a lot of people don’t have video as a core part of that. That’s something that we’ve totally forgotten, and so I think in the metaverse, video is going to be huge and I believe that the metaverse is VR. That’s why I think Pico Video is a metaverse use case. You and a bunch of friends can go to a concert and experience an epic concert and you’re taking selfies of your avatar and you’re sharing them on social media.
To me, that feels like a really strong metaverse experience. That’s becoming huge in China and I think that’s going to influence internationally these amazing experiences. As Pico expands out of China, these experiences are going to be done outside of China. [And in fact, after the interview was recorded, Pico announced exactly that.] I think it’s going to start to change the whole world’s perception of what video can be in VR.
Wow. That’s very fascinating. Before leaving you, if you still have something that you want to say to my readers and viewers, you’re free to talk and say whatever you want.
Nikk: Anything else to say, Wilson?
Wilson: I think maybe about Bytedance or some other people in China: there are many people that do some tests, some of their ideas will be successful or very cool, but many others will not be very cool, but they are still testing the best way to experience VR, and I think that’s cool.
Nikk: I love that, I’m not going to even add anything, that there are so many tests going on. Some things are going to succeed, and some things are going to fail, but it’s amazing to be in VR and see people trying new things. Something’s going to stick, and at FXG, we hope to be a part of it, of whatever that looks like in the future.
Wow, and I’m sure that among the things that will succeed there will be you, the cool people of FXG, because I have known you for a long time, and I know the cool things you’re working on, so I invite everyone to go to your website, so they can see everything you’ve done. I thank you, Nikk and Wilson, for this amazing chat!
Nikk: Look forward to chatting more. Hit us up with any questions about VR films coming to a metaverse near you! [chuckles]
I totally thank Nikk and Wilson for the time they have dedicated to me! If you liked this post, don’t forget to subscribe to my newsletter to not miss all the future ones 😉
(Image by FXG)