After the latest Unite event, Unity has released in open beta the tools to develop applications for the Apple Vision Pro. The development packages can be used only by people with a Unity Pro or Enterprise subscription, but the documentation is publicly available for everyone to see.
At VRROOM, we have a Unity Enterprise subscription, so I’ll be able to get my hands dirty with the SDK pretty soon... hopefully writing for you my classic tutorial on how to develop an application with a cube for this new platform. For now, I’ve read the available documentation, and I think it’s already worth telling you some very interesting tidbits that I’ve learnt about Apple Vision Pro development in Unity.
General Impressions
Before delving into the technical details, let me give you some overall impressions that even those of you who are not developers can understand. There is also some interesting news about Vacation Simulator in it 😛
Developing for Vision Pro
It seems from the documentation that the Unity and Apple teams worked together to make sure that development for this new platform was as close as possible to development for the other platforms. Unity is a cross-platform engine, and one of the reasons it got so popular is that, theoretically, once you have created your game for one platform (e.g. PC), it can be built and deployed on all other platforms (e.g. Android, iOS). We Unity developers know that it is never 100% this way: usually you need some little tweaks to make things work on all platforms, but the premise is mostly true. This is an advantage not only for the developer, who can do the hard work only once, but also for the platform holders: if developing for the Vision Pro required rewriting applications from scratch, many teams wouldn’t have the resources to do that and would skip Vision Pro, making the Apple ecosystem poorer.
That’s why it is fundamental that development for a new platform shares some foundations with development for the other ones. In fact, when developing for Apple, you also use the same basic tools you use on other XR platforms: keywords like URP, XR Interaction Toolkit, New Input System, AR Foundation, and Shader Graph should be familiar to all XR devs out there. And this is very good.
I also have to say that when reading the various docs, many things reminded me of the time I developed an experience for the HoloLens 1: I think that Apple took some inspiration from the work that Microsoft did when designing its SDK. This also made me realize how much Microsoft was ahead of its time (and its competitors) with HoloLens back in the day, and how much expertise it has thrown away by shutting down its Mixed Reality division.
Types of experiences
On Apple Vision Pro, you can run four types of applications:
- VR Experiences
- Exclusive MR experiences (the running experience is the only one active at that moment)
- Shared MR experiences (the experience runs simultaneously with others)
- 2D Windows (The experience is an iOS app in a floating window)
Developing VR experiences for Apple Vision Pro is very similar to doing it for the other platforms. In this case, the create-once-deploy-everywhere mantra of Unity works quite well. And this is fantastic. Creating MR experiences, instead, involves many breaking changes: the foundation tools to be used are the same as on other MR platforms, but the actual implementation is quite different. I think that porting an existing MR experience from another platform (e.g. HoloLens) to Vision Pro requires some heavy refactoring. And this is not ideal. I hope Apple improves on this side in the future.
Documentation and forums
Unity and Apple have worked together to release decent documentation for this new platform. There is enough available to get started. And there is also a dedicated forum on Unity Discussions to talk about Vision Pro development. Lurking around the forum, it is possible to gather some interesting information. First of all, it’s interesting to notice that the first posts were published on July 17th and mention that the information contained there could not be shared outside. This means that the first partner developers already got the private beta 4 months ago: Unity is slowly rolling out the SDK to developers. At first, it was distributed only to partners; now it is open to Pro subscribers; and probably later on it will be opened to everyone. This is a normal process: SDKs are very difficult to build (I’m learning this myself), so it’s important to control the rollout, giving them to more people only when they are more stable.
On the forum, it is possible to spot some well-known names from our ecosystem because, of course, all of us in the XR field want to experiment with this new device. One name that caught my eye, for instance, is a developer from Owlchemy Labs, who seems to be running some internal tests with Vacation Simulator and Vision Pro (which doesn’t guarantee the game will launch there, of course, but… it gives us hope). I think all the most famous XR studios are already working on this device.
Running the experiences
Apple has already opened registrations to receive a development device so that developers can start working on it. Devkits are very limited in number, so I think that for now they are being given only to Apple partners and to the most promising studios. In the post from Owlchemy above, the engineer mentions tests on the device, so it seems that Owlchemy already has a device to test on. Which is understandable, since they are one of the best XR studios out there.
All of us peasants who have not received a device yet can do tests with the emulator. Apple has distributed an emulator (which runs only on Mac, of course) so that you can run your experience in it and test its basic functionality. Emulators are never like the real device, but they are very important to test many features of the application anyway. When the application works on the emulator, the developer can apply to attend one of Apple’s labs, where you have one full day to test a prototype on the device, with Apple engineers in the same room ready to help with every need.
SDK Prerequisites
After the general introduction, it is now time to start with a more technical deep dive. And the first thing to talk about is the prerequisites for developing for VisionOS.
These are the requirements to develop for Apple Vision Pro:
- Unity 2022.3 LTS
- A Mac using Apple Silicon (Intel-powered Macs will be made compatible later)
- Xcode 15 beta 2
As for the Unity features to use:
- URP is strongly recommended. Some things may also work with the built-in render pipeline, but all the updates will be made with URP in mind
- Input System Package, i.e. the New Input System, to process input
- XR Interaction Toolkit to manage the foundations of the XR experience
These last requirements are, in my opinion, a very reasonable request. If you are developing an experience for Quest, most probably you are already using them all as your foundation (and, for instance, we at VRROOM have already based our application exactly on them).
Unity + Apple runtime
When a Unity experience runs on the Vision Pro, there is an integration between what is offered by the game engine and what is offered by the OS runtime. In particular, Unity provides the gameplay logic and the physics management, while the Apple runtime provides access to tracking, input, and AR data (i.e. the passthrough). This relation becomes even more important to know when running an MR experience, because in that case Unity becomes like a layer on top of RealityKit (it is not exactly like that, but it is a good way of visualizing it), and this translates into a lot of limitations when creating that kind of application.
Input management
Input detection happens through the XR Interaction Toolkit + New Input System, that is, using the tools we Unity XR devs already know very well. Some predefined Actions are provided to specify interactions peculiar to the Apple Vision Pro (e.g. gaze + pinch).
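To give an idea of what consuming one of these actions looks like with the New Input System, here is a minimal sketch. The action is referenced generically through an InputActionReference: the specific action and binding names shipped in Unity’s visionOS samples are not reproduced here, so treat the “pinch action” wiring as an assumption to be filled in from the actual asset.

```csharp
using UnityEngine;
using UnityEngine.InputSystem;

// Minimal sketch: logging when a pinch-like action fires. The action itself
// (assigned in the Inspector) is assumed to come from the predefined Actions
// mentioned in the docs; its exact name and binding are not reproduced here.
public class PinchLogger : MonoBehaviour
{
    [SerializeField] private InputActionReference pinchAction;

    private void OnEnable()
    {
        pinchAction.action.performed += OnPinch;
        pinchAction.action.Enable();
    }

    private void OnDisable()
    {
        pinchAction.action.performed -= OnPinch;
        pinchAction.action.Disable();
    }

    private void OnPinch(InputAction.CallbackContext context)
    {
        Debug.Log($"Pinch performed at time {context.time}");
    }
}
```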
Applications on Vision Pro do not use controllers, but just the hands. According to the documentation, when using the XR Interaction Toolkit, the system can also abstract the fact that hands are being used, and simply work with the usual hover/select/activate mechanism that we have when using controllers. I would like to verify this with an actual test, but if that were the case, it would be amazing, because it would mean that most of the basic interactions when using controllers (e.g. pointing at a menu button and clicking it) would work out of the box using the Vision Pro and hand tracking without any particular modification.
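If the abstraction works as advertised, wiring up an interactable should look identical to any other XRI platform; nothing in this little sketch is Vision Pro specific, which is exactly the point:

```csharp
using UnityEngine;
using UnityEngine.XR.Interaction.Toolkit;

// A button-like object: XRI raises the same hover/select events whether the
// user points with a controller ray or with gaze + pinch on Vision Pro.
[RequireComponent(typeof(XRSimpleInteractable))]
public class CubeButton : MonoBehaviour
{
    private XRSimpleInteractable interactable;

    private void Awake()
    {
        interactable = GetComponent<XRSimpleInteractable>();
        interactable.hoverEntered.AddListener(OnHoverEntered);
        interactable.selectEntered.AddListener(OnSelectEntered);
    }

    private void OnHoverEntered(HoverEnterEventArgs args)
    {
        Debug.Log($"Hovered by {args.interactorObject.transform.name}");
    }

    private void OnSelectEntered(SelectEnterEventArgs args)
    {
        Debug.Log("Selected (e.g. via gaze + pinch)");
    }
}
```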
Apart from detecting system gestures (e.g. gaze + pinch) via the Input System, or using the XR Interaction Toolkit to abstract high-level interactions, there is a third way through which input can be leveraged: the XR Hands package, which provides cross-platform hand tracking. At the low level, hands are tracked by ARKit, and the tracking data is abstracted by XR Hands, which gives the developer access to the poses of all the joints of both hands. Apple states that this is how Rec Room was able to give hands to its avatars on Vision Pro.
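For reference, reading a joint pose with XR Hands looks roughly like the sketch below. It uses only the generic, cross-platform API of the package (nothing Vision Pro specific), so take it as an illustration of the concept rather than as Apple-validated code.

```csharp
using System.Collections.Generic;
using UnityEngine;
using UnityEngine.XR.Hands;

// Reads the pose of the right index fingertip every frame through the
// XR Hands subsystem. On Vision Pro the data comes from ARKit underneath,
// but this script does not need to know that.
public class IndexTipReader : MonoBehaviour
{
    private XRHandSubsystem handSubsystem;

    private void Update()
    {
        if (handSubsystem == null)
        {
            var subsystems = new List<XRHandSubsystem>();
            SubsystemManager.GetSubsystems(subsystems);
            if (subsystems.Count == 0)
                return;
            handSubsystem = subsystems[0];
        }

        XRHand rightHand = handSubsystem.rightHand;
        if (!rightHand.isTracked)
            return;

        XRHandJoint indexTip = rightHand.GetJoint(XRHandJointID.IndexTip);
        if (indexTip.TryGetPose(out Pose pose))
            Debug.Log($"Right index tip at {pose.position}");
    }
}
```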
Eye tracking
Apple Vision Pro integrates high-precision eye tracking, but for privacy reasons, Apple prevents the developer from accessing gaze data. The only moment the developer has access to the gaze ray is the frame in which the user looks at an item and pinches it (and only when the application runs in “unbounded” mode).
Even if Apple restricts access to eye-tracking data, it still lets you use eye tracking in your application. For instance, the gaze + pinch gesture is abstracted as a “click” in your experience. And if you want objects to be hoverable via eye gaze, there is a dedicated script that does that automatically for you: putting this script on an object with a collider makes sure that the element is automatically highlighted by the OS when the user looks at it. I’m a bit puzzled by how this automatic highlight works with the object’s materials, and I will investigate it when I do more practical tests (hopefully one day also with the real device).
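From my reading of the docs, enabling this OS-driven highlight should just be a matter of adding that dedicated component next to a collider. Here is a minimal sketch of that idea; I’m assuming the component is the VisionOSHoverEffect one from the Polyspatial package, so verify the exact class and namespace against the package version you have installed.

```csharp
using UnityEngine;
using Unity.PolySpatial; // assumption: namespace of the Polyspatial components

// Makes this object eligible for the OS-driven eye-gaze highlight.
// Assumption: "VisionOSHoverEffect" is the dedicated script mentioned in the
// docs; check the exact component name in your Polyspatial package version.
public class GazeHighlightable : MonoBehaviour
{
    private void Start()
    {
        // The OS needs a collider to know which object the eyes are resting on.
        if (GetComponent<Collider>() == null)
            gameObject.AddComponent<BoxCollider>();

        if (GetComponent<VisionOSHoverEffect>() == null)
            gameObject.AddComponent<VisionOSHoverEffect>();
    }
}
```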
Foveated rendering
Apple mentions foveated rendering as one of the ways the Vision Pro manages to deliver experiences that look so good on the device. I would add that with that huge screen resolution, foveated rendering is a necessity if you don’t want the GPU of the device to melt 🙂
For now, Apple only talks about Fixed Foveated Rendering (also called Static Foveated Rendering), which is the same approach used by the Quest: with FFR, the central part of the displays is rendered at the maximum resolution, while the periphery is rendered at a lower one. FFR assumes that the user mostly looks straight ahead. Considering the high cost of the device and the quality of its eye tracking, I suppose that in the future they will switch to the better “dynamic” foveated rendering, which makes the device render at maximum resolution exactly the part of the screen you are looking at. Dynamic foveated rendering is better because with FFR you notice the degradation of the visuals when you rotate your eyes and look at the periphery of the screen.
AR tracking features
Apart from eye tracking and hand tracking, Vision Pro also offers image tracking. I found a mention of it in one of the various paragraphs of the current documentation. Image tracking is a very popular AR strategy that lets you put some content on top of some pre-set images that are identified as “markers”. In this mode, the system can detect the position and rotation of a known image in the physical world, so 3D objects can be put on top of it. It is one of the first forms of AR, which was made popular by Vuforia and Metaio.
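On the Unity side, image tracking is normally consumed through AR Foundation’s ARTrackedImageManager. I haven’t verified yet how much of this is exposed on Vision Pro, but the usual cross-platform usage looks like this sketch (it assumes an ARTrackedImageManager with a reference image library is already set up in the scene):

```csharp
using UnityEngine;
using UnityEngine.XR.ARFoundation;

// Standard AR Foundation image tracking: spawns a prefab on every newly
// detected marker image and lets it follow the marker's pose.
[RequireComponent(typeof(ARTrackedImageManager))]
public class MarkerContentSpawner : MonoBehaviour
{
    [SerializeField] private GameObject contentPrefab;

    private ARTrackedImageManager imageManager;

    private void Awake()
    {
        imageManager = GetComponent<ARTrackedImageManager>();
    }

    private void OnEnable()
    {
        imageManager.trackedImagesChanged += OnTrackedImagesChanged;
    }

    private void OnDisable()
    {
        imageManager.trackedImagesChanged -= OnTrackedImagesChanged;
    }

    private void OnTrackedImagesChanged(ARTrackedImagesChangedEventArgs args)
    {
        foreach (ARTrackedImage trackedImage in args.added)
        {
            // Parent the content to the tracked image so it follows the marker.
            Instantiate(contentPrefab, trackedImage.transform);
        }
    }
}
```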
Developing VR Immersive experiences
If you are already using the foundations that I specified above, porting your VR application to Vision Pro is rather easy. On Vision Pro, Unity runs VR experiences directly on top of Metal (for rendering) and ARKit (for eye/hand/etc. tracking).
The only thing needed to run your experience on Vision Pro is to install the VisionOS platform support and select Apple VisionOS as the provider in XR Plug-in Management. This is consistent with what we already do on all the other platforms.
The only difference from the other platforms is that, as with everything Apple-related in Unity, you do not build the executable directly: you build an Xcode project, from which you then build the final application.
This is one of the tidbits that reminded me of HoloLens development: to build a UWP application, you had to build a Visual Studio solution, from which you then built the final application for the device.
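By the way, if you script your builds, on the Unity side the flow is the usual BuildPipeline one; here is a minimal editor sketch that outputs the Xcode project. The scene path and output folder are placeholders, and I’m assuming the build target enum is exposed as BuildTarget.VisionOS in the Unity version you are using.

```csharp
#if UNITY_EDITOR
using UnityEditor;

// Editor-only helper: builds the Xcode project that you then open, build,
// and deploy from Xcode to the simulator or the device.
public static class VisionOSBuilder
{
    [MenuItem("Build/Build VisionOS Xcode Project")]
    public static void Build()
    {
        var options = new BuildPlayerOptions
        {
            scenes = new[] { "Assets/Scenes/Main.unity" }, // placeholder scene path
            locationPathName = "Builds/VisionOS",          // output folder for the Xcode project
            target = BuildTarget.VisionOS,                 // assumption: enum name in your Unity version
            options = BuildOptions.None
        };

        BuildPipeline.BuildPlayer(options);
    }
}
#endif
```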
There are a few limitations when making VR applications for the Vision Pro:
- You have to check the compatibility of your shaders with Metal
- You have to recompile your native plugins for VisionOS and pray that they work
- You have to use single-pass instanced rendering
- You have to make sure that there is a valid depth value in the depth buffer for every pixel, because it is used by the reprojection algorithms on the device (something that Meta also does). This means that all shaders should write to the depth buffer. This can be a problem with standard skyboxes, because the skybox is usually rendered “at infinity” and so has a zero depth value. The Unity team has already made sure that all the standard shaders, including the skybox ones, write a valid value. For your custom shaders, you have to do the work yourself
- You have to check the compatibility of everything you are using in general
All of this means that porting a VR app to Vision Pro should be rather trivial. Of course, I expect many little problems because we are talking about a new platform, with a new beta SDK, but in the long term, the process should become smooth.
Developing MR Immersive experiences
Developing mixed reality experiences on the Vision Pro is instead much more complex than the VR case, and may require heavy refactoring of an existing application.
The reason for this is that mixed reality applications run on top of RealityKit. It is not Unity communicating directly with the low-level runtime like in the VR case; instead, Unity works on top of RealityKit, so every feature has to be translated into RealityKit and cannot be supported if RealityKit does not support it. This is, in my opinion, a big issue, and I hope that the situation changes soon, because it is an enormous limitation for the development of cross-platform mixed reality experiences.
There are two types of MR experiences:
- Bounded: a bounded experience is an experience that happens in a cubic area of your room. The experience just runs within its bounds, and so it can run together with other experiences that are in your room, each of them inside its own little cube. You can imagine bounded experiences as widgets in your room. They have limited interactivity options.
- Unbounded: an unbounded experience happens all around you, exploiting the full power of AR/MR. Of course, only one unbounded experience can run at a time, but it can have its own bounded widgets running with it. Unbounded experiences are the classical MR apps and support all forms of input.
This distinction also reminds me a lot of HoloLens times because it was exactly the same: you could run many 2D widgets together, but only one immersive 3D experience at a time.
Whatever experience you want to create, you have to install not only the VisionOS platform support but also the Polyspatial plugin, which guarantees that your application can run on top of RealityKit. And the Polyspatial plugin, as I mentioned above, has a looooooooooot of restrictions. Some of them appear super-crazy at first look: even the standard Unity Camera is not supported by this plugin!
After having read more of the documentation, I realized that many of the standard scripts do not work because you have to use the ones provided by Polyspatial. For instance, instead of the standard Camera, you have to use a script called VolumeCamera. The same holds for lighting and baking: some of the features related to baking and shadows have to be implemented with dedicated scripts. That’s why I said that porting to this platform is a lot of work: many foundational scripts that are used on every other platform do not work on this one, and vice versa.
And it is not only a matter of scripts: not all shaders are supported, either. Shaders have to be translated by Unity into MaterialX so that they can be used by RealityKit, but RealityKit does not support all the features that Unity does. The basic standard shaders have already been made compatible by the Unity team, but, for instance, there is no support for custom ShaderLab shaders. You can only make custom shaders via Shader Graph (sorry, Amplify fans), and even there, not all the Shader Graph nodes are supported.
I’m not going to write here all the restrictions (you can find them in the docs), but suffice it to say that of the whole documentation about developing for VisionOS, there are 2 pages about VR development and maybe 10 about Polyspatial. This shows you how much more complicated it is to get used to this new development environment.
Development workflow (develop, build, test)
Talking about how to develop an experience for the Vision Pro, there are some other details to add:
- Unity provides a template project through which it is possible to see an example of a working project set up correctly for all the major targets: VR, bounded MR, unbounded MR, etc…
- There is a very cool Project Validator, which flags with a warning sign all the components used in your project that are not compatible with Polyspatial. This is very handy for spotting issues even before trying to build the application. I think all platforms should have something like that
Regarding building and testing:
- VisionOS applications in Unity support Play Mode (of course), so you can press the play button to do some preliminary tests in the Editor. This only tests the logic of the application in the editor, though, which is a very basic test. There is a cool feature that lets you record play mode sessions, so that you can re-play them without having to provide the same inputs again… this is very handy for debugging
- If you want to test on the device, but without building, you can use a feature called “Play To Device”, which renders the application on your computer and streams the visuals to your Vision Pro headset or emulator. The headset provides the tracking and visualization, but the logic and rendering are handled in Unity. It’s a bit like when you use Virtual Desktop to play VR games streamed from your PC to your Quest. Play To Device is a good hybrid test, but of course it is not a full test, because the application still runs on your Mac, in the safe Unity environment, and the real runtime is where you usually spot lots of new issues. But it is still very handy to use this feature. I’m telling you for the nth time that this software reminds me of HoloLens: Microsoft had this feature for HoloLens 1 in Unity, and it was called Holographic Remoting. I remember it being super-buggy, but it still saved a lot of time, because building your project to Visual Studio (for HoloLens; here it would be Xcode), rebuilding it, and deploying it to the device would literally take ages
- When the application has been tested enough in the editor, you can build it. Building it means building for the VisionOS platform, which is flagged as “experimental”, meaning that Unity currently advises against building anything with it that could go into production (and there is no risk of that happening, since the Vision Pro has not been released yet). A build for VisionOS is an Xcode project. The developer then has to take the Xcode project, build it in Xcode, and deploy it to either the emulator or the device
- If you don’t have a Vision Pro, or you have one but don’t want to waste ages deploying the built application to it, you can test the application in the Vision Pro simulator. The emulator is very cool (e.g. it lets you try the application in different rooms), but of course it has limitations, because it is an emulator. One current big limitation is that some ARKit tracking data is not provided, so some objects cannot be placed exactly where we would like. Testing on the device is the only way to make sure that things actually work.
Further References
Some important links to get started with Vision Pro development are:
- https://discussions.unity.com/t/welcome-to-unitys-visionos-beta-program/270282
- https://docs.unity3d.com/Packages/com.unity.polyspatial.visionos@0.6/manual/index.html
- https://developer.apple.com/videos/play/wwdc2023/10093
- https://create.unity.com/spatial
Final commentary
I think it’s cool to finally have some tools to play around with for Apple Vision Pro development. Of course, this SDK is still at its beginning, so I expect it to be full of bugs and missing many features. But we devs can get our hands on it and start creating prototypes, and this is great! I’m excited, and I can’t wait to get my hands dirty with it in the next few weeks! And you? What do you think about it? Are you interested in working on it, too? Let me know in the comments!
(Header image created using images from Unity and Apple)