10 rules to keep in mind when developing your ARKit/ARCore AR application

Smartphone-based augmented reality is the big trend of the moment: thanks to frameworks like Apple ARKit and Google ARCore, hundreds of millions of users can experience augmented reality without expensive AR glasses like HoloLens. In my opinion, this hype is a bit exaggerated, since AR on a smartphone is far inferior to AR on glasses, but I still think that we’re talking about a very cool technology.

These days there’s a true gold rush towards this kind of AR, and lots of developers are trying to build AR applications to be first on the market. Most of these demos still don’t convince me: they’re too rough, useless, or use AR where a standard screen would do better.

Yesterday I found myself discussing this with my friend Max: what kind of app would be interesting to have in AR on a smartphone? We soon realized that there are some points to take into account in this process, and I want to discuss them with you.

1. User must choose to live in AR

One of the biggest problems of smartphone-based augmented reality is that it requires the user to choose to enter AR. Users spend all their time outside AR and, at a certain point, must decide to launch the AR app to live an AR experience. This is an issue for two reasons:

  • It introduces friction. Humans are lazy by default, and every decision they have to make is a barrier to acting. Smartphones are so addictive because they lure us in with notifications that, with just a tap, take us to an app that gives us ephemeral pleasure. AR apps would need a similar mechanism, but even then, having to launch the app and scan the environment every time is a killer. With glasses, AR notifications could appear and AR apps could open very easily;
  • It can’t be contextual. If I wear AR glasses while walking down the street, I could see an AR notification telling me that the shop on my right has a product I would really love, along with some info about it. Since my smartphone is in my pocket most of the time, I can’t see any AR notification about the space around me. AR glasses can also analyze the space around me all the time, since their cameras look at what I’m looking at, while a phone sitting in my pocket obviously can’t.

So, with smartphone-based AR, you have to take into account that the phone is in the user’s pocket, and that the user must decide to take it out and start the AR experience. Users don’t have the phone in front of their face the whole time, so apps built around contextual info in AR are a no-go at the moment. You must develop an app that the user decides to open at a certain moment, in a certain place.

2. Application must be useful or fun

As I said in the previous point, the app must convince a lazy human to take out their phone, then launch and initialize an AR app. To trigger that behavior, you must create something useful or entertaining. Just putting a 3D model on a plane in AR is cool, but the novelty wears off quickly. Consider that when AR is released to the public, there will be huge hype due to the “wow effect”, but after that, people will get used to AR apps and just showing a 3D model won’t be cool anymore. You need to give them a reason to use your app.

Using tricks like notifications or achievements will, of course, extend your app’s lifetime.

3. Mix of the virtual and the real must make sense

Augmented reality makes sense only when you mix virtual elements with the real environment and the mix adds value compared with reality alone or the virtual elements alone. Take, for instance, an AR TV app. Looking through the smartphone screen at a virtual TV on my wall playing Netflix videos is absolutely useless, since it adds nothing to my movie experience. Furthermore, it is uncomfortable, because I would see the videos at a terrible resolution. It makes more sense to watch the videos directly on the smartphone screen, with a 2D app.

This ARKit demo about virtual menus, on the other hand, is very cool.

The reason is that it shows you the food directly on your plate, so you can preview exactly what you will get if you order it. Seeing just a 2D photo or a 3D model on the smartphone wouldn’t give you the same sense of realism. The real context greatly enhances the virtual elements.

One of the coolest demos I’ve tried on HoloLens is RoboRaid. This game is awesome for various reasons, but one of them is that it pretends that the robo-monsters are coming directly out of YOUR walls. You see holes in your walls. This game exploits your environment and mixes the real and the virtual. This ARCore demo does something similar, with little elements coming directly out of your real floor.

So, if the real world doesn’t add any context to the virtual elements and the virtual elements do not add anything to the real world, then AR is not the right choice for your application.

4. UX must not require free hands (maybe)

Remember that the user has to hold a phone with one hand or a tablet with two hands. This means that they won’t be hands-free. A cooking tutorial made for HoloLens, where you see a famous chef preparing awesome dishes next to you so that you can imitate him, is fantastic; with ARKit it is impossible, because the user can’t cook while holding a phone.

So, the user has limited interaction with the real world: at most one free hand. This means that they most probably can’t do anything else while using the AR app.

This holds unless the user is wearing one of the upcoming smartphone-based AR glasses like Prism or Aryzon. In that case, the user has both hands free and many more AR experiences become possible. The problem is that these glasses look very nerdy, and I guess that not many people will own them at first… and surely they will be used only indoors.

5. Single-playerness

As far as I know, there’s no multiplayer in AR apps. I mean, it’s not possible to have two ARCore devices in the same room sharing the same experience in the same space. With HoloLens this is possible (for instance in SketchUp Viewer), thanks to complicated “anchor sharing” algorithms. To share the same real space, the two devices have to map the environment and then exchange some sort of descriptors of the place where the virtual action is happening, so that each device can identify the other device’s frame of reference. (To keep things simple: the first device tells the other, “I’m playing at the center of a brown table.” The other device detects where a brown table is and sets its center as the playground center, too.) As far as I know, neither ARKit nor ARCore offers this. This means that, at the moment, smartphone AR experiences are single-player: keep that in mind.
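
To make the “brown table” idea a bit more concrete, here is a minimal Python sketch of the math behind a shared frame of reference. It is not how HoloLens does anchor sharing: it simply assumes that both devices have somehow recognized the same landmark and know its position and heading in their own gravity-aligned coordinates. The names `AnchorPose` and `a_to_b` are made up for this illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class AnchorPose:
    """Pose of the shared landmark (e.g. the table center) in ONE device's frame.

    Assumption: both devices use a y-up, gravity-aligned frame, so the two
    frames differ only by a translation and a rotation about the y axis.
    """
    x: float
    y: float
    z: float
    yaw: float  # rotation about the up axis, in radians


def rotate_y(v, angle):
    """Rotate a 3D vector about the y (up) axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    x, y, z = v
    return (c * x + s * z, y, -s * x + c * z)


def world_to_anchor(point, anchor):
    """Express a point given in a device's world frame in the anchor's local frame."""
    delta = (point[0] - anchor.x, point[1] - anchor.y, point[2] - anchor.z)
    return rotate_y(delta, -anchor.yaw)


def anchor_to_world(local, anchor):
    """Express an anchor-local point in the world frame of the device that owns `anchor`."""
    x, y, z = rotate_y(local, anchor.yaw)
    return (x + anchor.x, y + anchor.y, z + anchor.z)


def a_to_b(point_in_a, anchor_in_a, anchor_in_b):
    """Convert a point from device A's frame to device B's frame via the shared anchor."""
    return anchor_to_world(world_to_anchor(point_in_a, anchor_in_a), anchor_in_b)


# Hypothetical example: the same table center as seen by the two devices.
table_seen_by_a = AnchorPose(1.0, 0.0, 2.0, 0.0)
table_seen_by_b = AnchorPose(-3.0, 0.5, 0.0, math.pi / 2)

# A virtual object that device A placed one meter to the side of the table center.
object_in_b = a_to_b((2.0, 0.1, 2.0), table_seen_by_a, table_seen_by_b)
# object_in_b is approximately (-3.0, 0.6, -1.0)
```

The hard part, of course, is not this conversion but getting both devices to agree on where the landmark is, and that is exactly the piece the frameworks don’t give you.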

AR sharing: this scene is possible with HoloLens, but still not possible with your smartphone (Image by Microsoft)

6. No spatial memory

As far as I know (I’ve also Googled around), these frameworks have no way of saving descriptors of the environment around them, so there’s no way to save the position of a point in space. I mean, if I launch a game on a table and put a virtual object on it, the system has no way of remembering where that point was, so the next time you launch the experience, the object won’t be placed in exactly the same spot. Every session is independent of the others. Surely, knowing that the table has an identifiable shape, it is possible to roughly guess a similar position, but that is work you have to do by hand. You can’t save real-world positions for future use.

Design, if possible, an experience that can be re-started from scratch each time.

7. Exploitation of a single (planar) place

ARCore and ARKit can scan the environment, extract some feature points, and detect the planar surface where the game will take place. This planar surface may be a wall or a table, for instance. So, the game must happen on one plane, or at any rate relative to this main plane. If you look around, most ARKit demos are based on the floor or on a table. So, think about an experience that makes sense relative to a flat floor.
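
Just to give an intuition of what “detecting a plane from feature points” can mean, here is a toy Python sketch. It is not the algorithm ARKit or ARCore actually use (I don’t know their internals): it only assumes that feature points arrive as (x, y, z) tuples in a y-up frame, and looks for the height at which most of them pile up. The function name, the 5 cm bin size, and the minimum point count are arbitrary choices for this illustration.

```python
from collections import Counter


def dominant_horizontal_plane(points, bin_size=0.05, min_points=3):
    """Guess the main horizontal surface from a cloud of (x, y, z) feature points.

    Points are grouped into height bands `bin_size` meters thick; the most
    populated band is taken as the plane. Returns (height, inliers), or None
    when no band has at least `min_points` points.
    """
    if not points:
        return None
    bands = Counter(round(y / bin_size) for _, y, _ in points)
    best_band, count = bands.most_common(1)[0]
    if count < min_points:
        return None
    inliers = [p for p in points if round(p[1] / bin_size) == best_band]
    height = sum(p[1] for p in inliers) / len(inliers)
    return height, inliers


# Hypothetical scan: four points on a table top at about 0.70 m, plus clutter.
scan = [
    (0.0, 0.70, 0.0), (0.1, 0.71, 0.2), (0.3, 0.69, 0.1), (0.2, 0.70, 0.4),
    (0.0, 1.50, 0.0), (1.0, 0.20, 1.0),
]
result = dominant_horizontal_plane(scan)
```

Here `result` holds the estimated table height and the four points that belong to it. Everything in your experience can then be positioned relative to that height.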

The IKEA experience is a great example in this sense. Interior design is a great application of ARKit, since it is a perfect mix of real and virtual and happens on a planar surface.

Note that there are exceptions to this rule, but at the moment it may be safer to play it this way (tracking is more stable).

8. Mix of multiple frameworks

You can mix ARKit or ARCore with other frameworks to obtain very interesting projects. For instance, the guy who mixed Apple Maps with ARKit combined geolocation and AR. ARCore could be used with Vuforia, which offers object recognition features, to create, for instance, a soccer game on a fast-food table where two cans serve as goalposts. You can also use AI, big data analysis, etc. Let your imagination run free: the more you mix other frameworks with ARKit, the more powerful the augmented app can be.
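
As a small taste of the geolocation + AR mix, here is a Python sketch that converts the GPS coordinates of a point of interest into an offset in meters from the user, which is the kind of value you would need to place a virtual marker. I don’t know how the Apple Maps demo does it; this uses a simple flat-Earth (equirectangular) approximation that is only reasonable over short distances, and the function names and the “x east, y up, -z north” axis convention are assumptions made for this example.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, in meters


def geo_to_local_offset(user_lat, user_lon, poi_lat, poi_lon):
    """Approximate (east, north) offset in meters from the user to a point of interest.

    Uses an equirectangular approximation: fine for a few hundred meters,
    increasingly wrong over long distances or near the poles.
    """
    lat0 = math.radians(user_lat)
    d_lat = math.radians(poi_lat - user_lat)
    d_lon = math.radians(poi_lon - user_lon)
    north = d_lat * EARTH_RADIUS_M
    east = d_lon * EARTH_RADIUS_M * math.cos(lat0)
    return east, north


def to_ar_position(east, north, height=0.0):
    """Map an (east, north) offset to an (x, y, z) position.

    Assumed convention: x points east, y points up, -z points north.
    """
    return (east, height, -north)


# Hypothetical example: a monument 0.001 degrees of latitude north of the user.
east, north = geo_to_local_offset(45.0, 7.0, 45.001, 7.0)
marker = to_ar_position(east, north)
```

Keep in mind that phone GPS can easily be off by several meters, so a marker placed this way will float around its real target. This is good enough for “the monument is over there”, not for pinning a label onto a door handle.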

9. Remember that the user has to keep the phone in front of their face

Remember that holding the phone up can be tiring in the long run, so prefer shorter experiences (which are also preferable because AR drains the battery a lot). Furthermore, the user can’t walk down the street looking like a zombie to use your AR app, so prefer indoor applications. If you build an outdoor application, design it so that it is usable in dedicated places. For instance, an app that you take out in front of an important city monument and that shows you additional info in AR makes sense: you walk freely and take the phone out only in front of the monument, when you would have taken it out anyway to shoot a photo.

I know, you’re thinking “But with Pokemon Go, people were happy to walk around like zombies”. Yes, but you’re not Nintendo and don’t have a powerful brand like Pokemon. 😀

Having the phone in front of their face also means that the user’s FOV is very limited.

10. Keep the lights on

Since these frameworks rely on the rear RGB cameras of smartphones, they can’t work in low light. Of course, things could change if a future iPhone ships with rear depth cameras (the iPhone X unfortunately doesn’t have them) and if more Tango phones come out, but at the moment, if you plan to develop a horror experience like Face Your Fears, I’ve got some bad news for you.
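
One practical consequence is that your app could check how dark the camera image is and ask the user to turn on a light before tracking falls apart. Here is a minimal Python sketch of that idea. It assumes the frame is available as a sequence of (r, g, b) tuples with values from 0 to 255, and the threshold of 40 is an arbitrary number picked for illustration, not a value taken from ARKit or ARCore.

```python
def mean_luma(pixels):
    """Average brightness (0-255) of a frame given as (r, g, b) tuples.

    Uses the Rec. 601 luma weights to approximate perceived brightness.
    """
    total = 0.0
    count = 0
    for r, g, b in pixels:
        total += 0.299 * r + 0.587 * g + 0.114 * b
        count += 1
    if count == 0:
        raise ValueError("empty frame")
    return total / count


def is_too_dark(pixels, threshold=40.0):
    """Return True when the frame is darker than `threshold` (an assumed cutoff)."""
    return mean_luma(pixels) < threshold


# Hypothetical frames: a dim room and a well-lit one.
dim_frame = [(10, 10, 10)] * 100
lit_frame = [(200, 180, 160)] * 100

if is_too_dark(dim_frame):
    print("It is too dark for AR tracking: please turn on a light.")
```

In a real app you would tune the threshold by testing where tracking actually starts to fail on your target devices.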


So, these are some rules to keep in mind when designing your next AR masterpieces. Of course, things are going to change as the AR frameworks evolve. Anyway, if you have something more to add, let me know in the comments!

