A simple Augmented Reality engine for iOS devices

Absolutely beyond my expectations, I have been receiving a lot of questions about the Augmented Reality we implemented in some of our iPhone applications. Honestly, I expected the “Wow!” or “Cool!” from final users, not from other developers. But then I realized that many of the iOS developers out there do not have enough mathematical background to carry out those geometrical calculations.

I worked on Augmented Reality many years ago. More precisely, I was working on Wearable Computing, Computer Vision and Digital Image Processing, and Augmented Reality was one of the technologies I investigated for a while. However, that type of Augmented Reality was much more complex than what I am going to show you here. At that time, I had to process, in real time, the scene acquired by a video camera with very complex digital image processing algorithms and, using some statistical reasoning, my algorithms had to understand the scene and provide the user with additional information augmenting her reality.

These kinds of algorithms required very complex and heavy computations, far beyond what an ARM processor can do even today. So, the Augmented Reality you can implement on an iOS device is very simple compared to that. The simplification comes from the a priori knowledge of what you want to augment and represent on the iPhone screen. For example, if you want to show the location of a subway station with augmented reality, you use information you already know about it before implementing it (and that is why it is a priori knowledge). In my previous research, I did not have any a priori information. Instead, I first had to understand the reality I was looking at through the camera and then add some useful information related to what I recognized.

So, the Augmented Reality iPhone apps currently in the App Store are in reality a fake version of Augmented Reality. I mean, you are not really augmenting anything; it is just a different representation of reality. They recognize absolutely nothing and, more importantly, they do not add anything new to what you already know.

In general, Augmented Reality (AR) is a term for a live, direct or indirect view of a physical, real-world environment whose elements are augmented by computer-generated sensory input, such as sound or graphics. In the specific case of an iOS device, Augmented Reality uses the location services included in the device: the aGPS and the magnetometer. You need to do a very simple mathematical exercise to make it work, but I have already done it for you. What I show you here is a very simple model, but you can make it much more complex.

The mathematics behind it is contained in the following picture.

There, the point (x0, y0) represents the position of the iPhone user on the Earth (latitude and longitude), while the point (x1, y1) represents the position of the object you want to show on the screen of the user’s device through the camera. Usually, we call this object a POI (Point Of Interest). You can represent many POIs on the screen. Here, to keep it short, I just show you an example for a single POI; you can then easily develop the multi-POI case by yourself.

Now, since the distance between the POI and you is usually much smaller than the distance between the North Pole and either of you, you can assume that the two lines connecting the North Pole with the POI and the North Pole with you are parallel. Having said that, you should know that when two parallel lines are intersected by a third line, the two angles ß are identical, as shown in the previous picture. The angle ß represents the deviation of the direction of your view of the object from the direction of the North Pole.

So, knowing your location from the iPhone GPS and the POI location (which you know a priori and have stored somewhere in the device), you can calculate the angle ß. With a little trigonometry, you get ß = arctan(a/b), where a and b are the two legs of the triangle in the picture. But ß is also the angle provided by the iPhone magnetometer when the object is exactly in the middle of the iPhone screen (when you are looking at it through the camera). So, your mathematical equation is already solved: when the computed angle ß coincides with the angle provided by the magnetometer, you know the object is in the middle of the screen. So, supposing you hold the iPhone in portrait mode in front of you, the x coordinate of the object on the screen will be x = 320/2 = 160 px. To simplify the model, I assume the y coordinate is always fixed. Let’s say, y = 480/2 = 240 px. With some small additional mathematics, you can also use the pitch of the iPhone to move the object up and down on the screen.

Obviously, to make everything more realistic, you need to move the object left and right on the screen with a deviation that is proportional to the difference between ß and the angle provided by the magnetometer. Now, supposing that the iPhone camera has an aperture of 60°, the 320 px of screen width cover those 60°, i.e. 320/60 px per degree, so the object x coordinate on the screen becomes x = 320/2 + ∂·320/60, where ∂ is the difference between the computed ß and the orientation obtained from the magnetometer while you are moving the phone around you.
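That mapping from ∂ to the screen coordinate can be sketched as a small C function. The clamping to the screen edges is my own addition, not part of the article’s model:

```c
// Screen x position (portrait, 320 px wide) of a POI whose bearing
// differs from the current magnetometer heading by `delta` degrees.
// With a 60-degree camera aperture, the 320 px of screen width cover
// 60 degrees, i.e. 320/60 px per degree; delta = 0 centers the object.
double screenX(double delta) {
    double x = 320.0 / 2.0 + delta * (320.0 / 60.0);
    // Clamp off-screen values to the screen edges (my own addition).
    if (x < 0.0)   x = 0.0;
    if (x > 320.0) x = 320.0;
    return x;
}
```

So an object 30° to the right of where the camera points sits at the right edge of the screen, and one straight ahead sits at x = 160 px.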

At this point, you have all the elements to build a small testing app. Fire up Xcode, create a view-based project and name it ARNorth. Add the CoreLocation framework to your project and import it in the ARNorthViewController.h file.

In the ARNorthViewController.xib, add a button and title it Start. In ARNorthViewController.h, declare the action -start and connect it to the button.
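As a reference, the header could look like this. This is a minimal sketch; the property names (locationManager, imagePicker, arrowView, currentLocation) are my own choices, not from the original listing:

```objc
// ARNorthViewController.h
#import <UIKit/UIKit.h>
#import <CoreLocation/CoreLocation.h>

@interface ARNorthViewController : UIViewController <CLLocationManagerDelegate,
                                                     UINavigationControllerDelegate,
                                                     UIImagePickerControllerDelegate>

@property (nonatomic, strong) CLLocationManager *locationManager;
@property (nonatomic, strong) UIImagePickerController *imagePicker;
@property (nonatomic, strong) UIView *arrowView;        // overlay holding the arrow image
@property (nonatomic, strong) CLLocation *currentLocation;

- (IBAction)start:(id)sender;

@end
```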
So, the body of the -start action will contain:
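The original listing is missing from this copy of the article, so here is a hedged reconstruction of what -start could contain, based on the description in the next paragraph (the property names are my own):

```objc
- (IBAction)start:(id)sender {
    // Start receiving the user location; it is one vertex of the triangle.
    self.locationManager = [[CLLocationManager alloc] init];
    self.locationManager.delegate = self;
    self.locationManager.desiredAccuracy = kCLLocationAccuracyBest;
    [self.locationManager startUpdatingLocation];

    // Present the camera with the arrow view as an overlay.
    self.imagePicker = [[UIImagePickerController alloc] init];
    self.imagePicker.sourceType = UIImagePickerControllerSourceTypeCamera;
    self.imagePicker.showsCameraControls = NO;            // remove the camera controls
    self.imagePicker.cameraOverlayView = self.arrowView;  // the view with the arrow image
    [self presentModalViewController:self.imagePicker animated:YES];
}
```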

So, when you press the Start button, the GPS starts to provide you with your location and then the image picker is presented on the screen. You remove the camera controls and set the arrowView as the cameraOverlayView. This is simply a view on top of which there is an image representing the object (in this case, I used an image of an arrow I created in DrawIt).

The location manager starts to provide you with locations through the -locationManager:didUpdateToLocation:fromLocation: delegate method. In a very simple version, this method will contain:
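Again, the original listing is missing here; a minimal sketch of what this delegate method could contain, under the same assumptions as above:

```objc
- (void)locationManager:(CLLocationManager *)manager
    didUpdateToLocation:(CLLocation *)newLocation
           fromLocation:(CLLocation *)oldLocation {
    // Keep the latest user position (one vertex of the triangle).
    self.currentLocation = newLocation;

    // Once we know where we are, start the magnetometer.
    if ([CLLocationManager headingAvailable]) {
        [self.locationManager startUpdatingHeading];
    }
}
```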

So, as soon as you get the user location, you start the magnetometer to obtain the North Pole direction. Then, in -locationManager:didUpdateHeading:, you capture the heading value and make all the calculations as previously explained:
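The original listing is missing here as well; a hedged sketch of the heading callback, putting the previous formulas together (the POI constants kPOILatitude and kPOILongitude are hypothetical placeholders for the a priori coordinates you stored):

```objc
- (void)locationManager:(CLLocationManager *)manager
       didUpdateHeading:(CLHeading *)newHeading {
    // ß: bearing of the POI from the user (flat-Earth approximation).
    double a = kPOILongitude - self.currentLocation.coordinate.longitude;
    double b = kPOILatitude  - self.currentLocation.coordinate.latitude;
    double beta = atan2(a, b) * 180.0 / M_PI;

    // ∂: difference between ß and the direction the camera is facing.
    double delta = beta - newHeading.magneticHeading;

    // Move the arrow horizontally: 320 px of screen cover the 60° aperture.
    CGRect frame = self.arrowView.frame;
    frame.origin.x = 320.0 / 2.0 + delta * (320.0 / 60.0);
    self.arrowView.frame = frame;
}
```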

If everything was done correctly, you should have an arrow indicating the North Pole. I intentionally left some values and some assignments explicit, so that you can understand what I am doing.

In this example, the initial conditions are missing. So, when you launch the app and start the camera, the object is in the middle of the screen; you should estimate its real position before launching the camera and display it accordingly. Additionally, I intentionally did not put any button to go back to the parent view controller, and I did not adjust the camera view to fill the screen. I leave all these things to you.

As said before, you can make this simple model more complex. For example, using the accelerometer, you can support rotating the phone to landscape mode. You can also add the elevation to make the object position depend on its angle with the ground. But again, it’s just a small mathematical exercise.

Keep coding,



