Friday, 1 January 2010

Augmented Reality on the iPhone - how to

The sample app for this post is available here

When Apple released the 3.1 update to the iPhone operating system, they added some extra properties to UIImagePickerController that let you add your own camera overlay (cameraOverlayView) and hide the camera controls (showsCameraControls). Before this, developers had to dive into the UIView hierarchy and hack things around, as detailed here. Fortunately the new API is a lot simpler, and quite a few applications have been released that take advantage of it to do some Augmented Reality.

Unfortunately one thing that is still lacking is access to the real time video feed from the camera. This limits what you can currently do with the iPhone in terms of real time video processing.

There have been various attempts to access the camera, but unfortunately they all seem to fall outside the public SDK, so using them in an app destined for the App Store is not possible.

However, Apple recently announced that they would allow applications to use the UIGetScreenImage API call, which had previously been a private function. This opens up some possibilities for accessing the real time video feed. Unfortunately the function is an all or nothing screen grab, which makes it a bit difficult to both draw data on top of the camera view and access the real time feed: anything you draw ends up in the grabbed image.

Fortunately as you can see from the screen shot and the video you can still do some cool stuff. I've used it in my App Sudoku Grab to great effect.

If you look carefully at the screenshot you'll see that I'm actually drawing to the screen using a checkerboard pattern - this gives a good enough image for the user to see, but still allows enough of the camera preview image to show through to be usable. Hopefully this blog post will provide enough information to get you started on implementing your own Augmented Reality app. I've attached a link to a sample application at the end of the post.

So, how does it all work?

The first thing that we need to do any useful image processing is a way to get at the pixels of an image. These two utility functions here give you that:
    #include <CoreGraphics/CoreGraphics.h>

    // Convert a region of a CGImage into an 8-bit greyscale Image
    Image *fromCGImage(CGImageRef srcImage, CGRect srcRect) {
        Image *result=createImage(srcRect.size.width, srcRect.size.height);
        // get hold of the image bytes
        CGColorSpaceRef colorSpace=CGColorSpaceCreateDeviceGray();
        CGContextRef context=CGBitmapContextCreate(result->rawImage,
                result->width,
                result->height,
                8,              // bits per component
                result->width,  // bytes per row (one byte per pixel)
                colorSpace,
                kCGImageAlphaNone);
        // lowest possible quality for speed
        CGContextSetInterpolationQuality(context, kCGInterpolationNone);
        CGContextSetShouldAntialias(context, NO);
        // get the rectangle of interest from the image
        CGImageRef subImage=CGImageCreateWithImageInRect(srcImage, srcRect);
        // draw it into our bitmap context
        CGContextDrawImage(context, CGRectMake(0,0, result->width, result->height), subImage);
        // cleanup
        CGContextRelease(context);
        CGColorSpaceRelease(colorSpace);
        CGImageRelease(subImage);
        return result;
    }

    // Convert a greyscale Image back into a CGImage (the caller releases it)
    CGImageRef toCGImage(Image *srcImage) {
        // generate space for the result - four bytes per pixel
        uint8_t *rgbData=(uint8_t *) calloc(srcImage->width*srcImage->height*sizeof(uint32_t),1);
        // process the greyscale image back to rgb
        for(int i=0; i<srcImage->height*srcImage->width; i++) {
            // no alpha - this byte is skipped
            rgbData[i*4]=0;
            int val=srcImage->rawImage[i];
            // rgb values
            rgbData[i*4+1]=val;
            rgbData[i*4+2]=val;
            rgbData[i*4+3]=val;
        }
        // create the CGImage from this data
        CGColorSpaceRef colorSpace=CGColorSpaceCreateDeviceRGB();
        CGContextRef context=CGBitmapContextCreate(rgbData,
                srcImage->width,
                srcImage->height,
                8,
                srcImage->width*sizeof(uint32_t),
                colorSpace,
                kCGBitmapByteOrder32Little|kCGImageAlphaNoneSkipLast);
        CGImageRef image=CGBitmapContextCreateImage(context);
        // cleanup
        CGContextRelease(context);
        CGColorSpaceRelease(colorSpace);
        free(rgbData);
        return image;
    }
The first function takes a CGImage and a region of interest and turns it into bytes that represent the pixels of the image. The second function reverses the process and will give you a CGImage from raw bytes. To make life a bit simpler I'm using a structure that packages up information about the raw image data:
    #include <stdint.h>
    #include <stdlib.h>

    typedef struct {
        uint8_t *rawImage; // the raw pixel data
        uint8_t **pixels;  // 2D array of pixels e.g. use pixels[y][x]
        int width;
        int height;
    } Image;

    Image *createImage(int width, int height) {
        Image *result=(Image *) malloc(sizeof(Image));
        result->width=width;
        result->height=height;
        result->rawImage=(uint8_t *) calloc(result->width*result->height, 1);
        // create a 2D array - this makes using the data a lot easier
        result->pixels=(uint8_t **) malloc(sizeof(uint8_t *)*result->height);
        for(int y=0; y<result->height; y++) {
            // each row pointer points into rawImage, so no pixel data is copied
            result->pixels[y]=result->rawImage+y*result->width;
        }
        return result;
    }

    void destroyImage(Image *image) {
        free(image->rawImage);
        free(image->pixels);
        free(image);
    }
Normally I would write something like this in C++ - but to keep things simple and allow the use of standard Objective-C I've stuck to straight C for this demo.
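To make the row-pointer idea concrete, here is a minimal sketch in plain C. The names `GrayImage`, `gray_create`, `gray_destroy` and `gray_get_flat` are hypothetical stand-ins for illustration, not part of the sample app, and the allocation checks are an addition that the code above does not have. It shows that writing through the 2D row table and reading through the flat buffer touch the same byte.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the post's Image struct. */
typedef struct {
    uint8_t *raw;   /* width*height bytes, row-major */
    uint8_t **rows; /* rows[y] points at the start of row y inside raw */
    int width;
    int height;
} GrayImage;

/* Returns NULL if the size is invalid or any allocation fails. */
GrayImage *gray_create(int width, int height) {
    if (width <= 0 || height <= 0) return NULL;
    GrayImage *img = (GrayImage *) malloc(sizeof(GrayImage));
    if (!img) return NULL;
    img->width = width;
    img->height = height;
    img->raw = (uint8_t *) calloc((size_t)width * (size_t)height, 1);
    img->rows = (uint8_t **) malloc(sizeof(uint8_t *) * (size_t)height);
    if (!img->raw || !img->rows) {
        free(img->raw);
        free(img->rows);
        free(img);
        return NULL;
    }
    /* The row table aliases the flat buffer; no pixel data is duplicated. */
    for (int y = 0; y < height; y++) {
        img->rows[y] = img->raw + (size_t)y * (size_t)width;
    }
    return img;
}

void gray_destroy(GrayImage *img) {
    if (!img) return;
    free(img->raw);
    free(img->rows);
    free(img);
}

/* Reads a pixel through the flat buffer rather than the row table. */
uint8_t gray_get_flat(const GrayImage *img, int x, int y) {
    assert(x >= 0 && x < img->width && y >= 0 && y < img->height);
    return img->raw[(size_t)y * (size_t)img->width + (size_t)x];
}
```

After `img->rows[2][1] = 200;` on a 4x3 image, `gray_get_flat(img, 1, 2)` reads back 200, because `rows[2]` is just `raw + 2*4`.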

We can now use these functions to start doing something useful. The first thing we are going to need is a view that can draw using the checkerboard mask.
    - (id)initWithFrame:(CGRect)frame {
        if ((self = [super initWithFrame:frame])) {
            // create the mask image
            Image *checkerBoardImage=createImage(self.bounds.size.width, self.bounds.size.height);
            // even rows: set the even columns
            for(int y=0; y<checkerBoardImage->height; y+=2) {
                for(int x=0; x<checkerBoardImage->width; x+=2) {
                    checkerBoardImage->pixels[y][x]=255;
                }
            }
            // odd rows: set the odd columns
            for(int y=1; y<checkerBoardImage->height; y+=2) {
                for(int x=1; x<checkerBoardImage->width; x+=2) {
                    checkerBoardImage->pixels[y][x]=255;
                }
            }
            // convert to a CGImage
            maskImage=toCGImage(checkerBoardImage);
            // cleanup
            destroyImage(checkerBoardImage);
        }
        return self;
    }

    - (void)drawRect:(CGRect)rect {
        // we're going to draw into an image using our checkerboard mask
        UIGraphicsBeginImageContext(self.bounds.size);
        CGContextRef context=UIGraphicsGetCurrentContext();
        CGContextClipToMask(context, self.bounds, maskImage);
        // do your drawing here

        ////////
        UIImage *imageToDraw=UIGraphicsGetImageFromCurrentImageContext();
        UIGraphicsEndImageContext();

        // now do the actual drawing of the image
        CGContextRef drawContext=UIGraphicsGetCurrentContext();
        CGContextTranslateCTM(drawContext, 0.0, self.bounds.size.height);
        CGContextScaleCTM(drawContext, 1.0, -1.0);
        // very important to switch these off - we don't want our grid pattern to be disturbed in any way
        CGContextSetInterpolationQuality(drawContext, kCGInterpolationNone);
        CGContextSetShouldAntialias(drawContext, NO);
        CGContextDrawImage(drawContext, self.bounds, [imageToDraw CGImage]);

        // stash the results of our drawing so we can remove them later
        if(drawnImage) destroyImage(drawnImage);
        drawnImage=fromCGImage([imageToDraw CGImage], self.bounds);
    }
The line of code that does the clever stuff is here:
    CGContextClipToMask(context, self.bounds, maskImage);
That tells Core Graphics to use our checkerboard image as a clipping mask. As we've only set alternate pixels in the mask, this has the effect of filtering our drawing commands so they only show up on alternate pixels. You might be wondering why I'm drawing to an image and then drawing that to the screen - we'll be making use of the image in a bit.
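The two loops in initWithFrame set a mask pixel when x and y are both even or both odd. As a rough sketch of that pattern in isolation (the function names `fill_checkerboard` and `count_set` are made up for this illustration, and the mask is a flat row-major buffer rather than the Image struct), the same checkerboard can be built in a single pass by starting each row at column `y % 2`:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Fill a row-major width*height mask with a one-pixel checkerboard:
 * 255 where x and y have the same parity, 0 everywhere else. */
void fill_checkerboard(uint8_t *mask, int width, int height) {
    assert(mask != NULL && width > 0 && height > 0);
    memset(mask, 0, (size_t)width * (size_t)height);
    for (int y = 0; y < height; y++) {
        /* even rows start at column 0, odd rows at column 1 */
        for (int x = y % 2; x < width; x += 2) {
            mask[(size_t)y * (size_t)width + (size_t)x] = 255;
        }
    }
}

/* Count the mask pixels that are set. */
size_t count_set(const uint8_t *mask, int width, int height) {
    size_t n = 0;
    for (size_t i = 0; i < (size_t)width * (size_t)height; i++) {
        if (mask[i] != 0) n++;
    }
    return n;
}
```

On a 4x4 mask this sets 8 of the 16 pixels, so roughly half of the screen pixels carry the overlay and the other half still show the camera preview.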

Now in our view controller where we launch the image picker we can use this view as the camera overlay:
    -(IBAction) runAugmentedReality {
        // set up our camera overlay view

        // tool bar - handy if you want to be able to exit from the image picker...
        UIToolbar *toolBar=[[[UIToolbar alloc] initWithFrame:CGRectMake(0, 480-44, 320, 44)] autorelease];
        NSArray *items=[NSArray arrayWithObjects:
                [[[UIBarButtonItem alloc] initWithBarButtonSystemItem:UIBarButtonSystemItemFlexibleSpace target:nil action:nil] autorelease],
                [[[UIBarButtonItem alloc] initWithBarButtonSystemItem:UIBarButtonSystemItemDone target:self action:@selector(finishedAugmentedReality)] autorelease],
                nil];
        [toolBar setItems:items];
        // create the overlay view
        overlayView=[[[OverlayView alloc] initWithFrame:CGRectMake(0, 0, 320, 480-44)] autorelease];
        // important - it needs to be transparent so the camera preview shows through!
        overlayView.opaque=NO;
        overlayView.backgroundColor=[UIColor clearColor];
        // parent view for our overlay
        UIView *parentView=[[[UIView alloc] initWithFrame:CGRectMake(0,0,320, 480)] autorelease];
        [parentView addSubview:overlayView];
        [parentView addSubview:toolBar];

        // configure the image picker with our overlay view
        UIImagePickerController *picker=[[UIImagePickerController alloc] init];
        picker.sourceType = UIImagePickerControllerSourceTypeCamera;
        // hide the camera controls
        picker.showsCameraControls=NO;
        picker.delegate = nil;
        picker.allowsImageEditing = NO;
        // and put our overlay view in
        picker.cameraOverlayView=parentView;
        [self presentModalViewController:picker animated:YES];
        [picker release];
        // start our processing timer - fires five times a second
        processingTimer=[NSTimer scheduledTimerWithTimeInterval:1/5.0f target:self selector:@selector(processImage) userInfo:nil repeats:YES];
    }
The important line of code here is:
    // and put our overlay view in
    picker.cameraOverlayView=parentView;
This puts our view on top of the camera's screen. We can now start grabbing images from the screen using UIGetScreenImage:
    // this is where it all happens
    CGImageRef UIGetScreenImage(void);

    -(void) processImage {
        // grab the screen
        CGImageRef screenCGImage=UIGetScreenImage();
        // turn it into something we can use
        Image *screenImage=fromCGImage(screenCGImage, overlayView.frame);
        CGImageRelease(screenCGImage);
        // process the image to remove our drawing - WARNING the edge pixels of the image are not processed
        Image *drawnImage=overlayView.drawnImage;
        for(int y=1; y<screenImage->height-1; y++) {
            for(int x=1; x<screenImage->width-1; x++) {
                // if we drew to this pixel replace it with the average of the surrounding pixels
                if(drawnImage->pixels[y][x]!=0) {
                    screenImage->pixels[y][x]=(screenImage->pixels[y-1][x]+screenImage->pixels[y+1][x]+
                            screenImage->pixels[y][x-1]+screenImage->pixels[y][x+1])/4;
                }
            }
        }
        // do something clever with the image here and tell the overlay view to draw stuff
        // simple edge detection and following:
        CGMutablePathRef pathRef=CGPathCreateMutable();
        int lastX=-1000, lastY=-1000;
        for(int y=0; y<screenImage->height-1; y++) {
            for(int x=0; x<screenImage->width-1; x++) {
                // average of the horizontal and vertical brightness differences
                int edge=(abs(screenImage->pixels[y][x]-screenImage->pixels[y][x+1])+
                        abs(screenImage->pixels[y][x]-screenImage->pixels[y+1][x]))/2;
                if(edge>10) {
                    // squared distance from the last point added to the path
                    int dist=(x-lastX)*(x-lastX)+(y-lastY)*(y-lastY);
                    if(dist>50) {
                        CGPathMoveToPoint(pathRef, NULL, x, y);
                        lastX=x;
                        lastY=y;
                    } else if(dist>10) {
                        CGPathAddLineToPoint(pathRef, NULL, x, y);
                        lastX=x;
                        lastY=y;
                    }
                }
            }
        }
        // update the overlay view
        [overlayView setPath:pathRef];
        //////////////

        // finished with the screen image
        destroyImage(screenImage);
    }
For this example I'm doing some pretty basic edge detection and putting the edges into a CGPath. This is then drawn by the overlay view.
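The edge measure used above is just the average of the absolute brightness difference to the pixel on the right and to the pixel below. Here is a small sketch of that measure in plain C, on a flat row-major greyscale buffer; the helper names `edge_strength` and `count_edges` are illustrative only and do not appear in the sample app.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Average of the absolute differences to the right-hand and lower
 * neighbours. Requires x < width-1 and y < height-1. */
int edge_strength(const uint8_t *px, int width, int x, int y) {
    assert(px != NULL && x >= 0 && y >= 0 && x < width - 1);
    int here  = px[y * width + x];
    int right = px[y * width + x + 1];
    int below = px[(y + 1) * width + x];
    return (abs(here - right) + abs(here - below)) / 2;
}

/* Count the pixels whose edge strength exceeds the threshold.
 * The last row and column are skipped, as in the loop above. */
int count_edges(const uint8_t *px, int width, int height, int threshold) {
    int count = 0;
    for (int y = 0; y < height - 1; y++) {
        for (int x = 0; x < width - 1; x++) {
            if (edge_strength(px, width, x, y) > threshold) count++;
        }
    }
    return count;
}
```

For a 4x4 image whose left two columns are 0 and right two columns are 100, only the pixels in column 1 sit next to a brightness change, and each has an edge strength of (100 + 0) / 2 = 50, comfortably above the threshold of 10 used in the post.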

The important lines of code are here. We ask the overlay view for the image it drew to the screen, then we go through each pixel to see if we drew to it. If we did, we replace it with the average of the surrounding pixels. This way we remove any artefacts from our drawing at the loss of some screen resolution:
    // process the image to remove our drawing - WARNING the edge pixels of the image are not processed
    Image *drawnImage=overlayView.drawnImage;
    for(int y=1; y<screenImage->height-1; y++) {
        for(int x=1; x<screenImage->width-1; x++) {
            // if we drew to this pixel replace it with the average of the surrounding pixels
            if(drawnImage->pixels[y][x]!=0) {
                screenImage->pixels[y][x]=(screenImage->pixels[y-1][x]+screenImage->pixels[y+1][x]+
                        screenImage->pixels[y][x-1]+screenImage->pixels[y][x+1])/4;
            }
        }
    }
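The same clean-up step can be tried outside the app. The sketch below works on flat row-major buffers, and the function name `remove_overlay` is made up for this illustration. It modifies the screen buffer in place, which looks safe here on the assumption that the overlay only ever lands on checkerboard pixels: the four direct neighbours of a drawn pixel then always fall on the other colour of the checkerboard, so they are untouched camera pixels.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Replace every interior pixel that the overlay drew on (drawn != 0)
 * with the average of its four direct neighbours in the screen grab.
 * The one-pixel border of the image is left unprocessed. */
void remove_overlay(uint8_t *screen, const uint8_t *drawn, int width, int height) {
    assert(screen != NULL && drawn != NULL && width > 0 && height > 0);
    for (int y = 1; y < height - 1; y++) {
        for (int x = 1; x < width - 1; x++) {
            if (drawn[y * width + x] != 0) {
                int sum = screen[(y - 1) * width + x]
                        + screen[(y + 1) * width + x]
                        + screen[y * width + (x - 1)]
                        + screen[y * width + (x + 1)];
                screen[y * width + x] = (uint8_t)(sum / 4);
            }
        }
    }
}
```

With a 3x3 grab whose centre pixel was drawn on and whose neighbours are 10, 20, 30 and 40, the centre becomes 25. A drawn pixel in the corner keeps its value, which is the edge-pixel caveat mentioned in the code comment above.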
That's it! The full sample app is available here: http://dl.dropbox.com/u/508075/augmented_reality/AugmentedRealitySample.zip

16 comments:

志兴 said...

nice work and thanks for sharing.that helps me a lot

Jennifer said...

Thanks for sharing this, I really appreciate it. Incredibly helpful.

Vi said...

Thanks for writing this, I found it very helpful. One suggestion, if you are releasing source code to the public then you should include a software license. Otherwise no one knows the legal ways they can use the code.

trilobite said...

Thanks for sharing helpful information.
By the way, I have ported your sample code to C# (MonoTouch). I have written a blog entry how to start augmented reality using MonoTouch. May I have your permission to publish it?

http://blog.reinforce-lab.com/2010/02/monotouchaugmented-reality-how-to.html

Henry said...

Thank you for sharing. This example is invaluable for me.

IpUrBeLtZ said...

Neat! Thanks for your work!! ^^

Mareikiii said...

wow great work!

지윤서윤 said...

Great job, and thanks for your sharing code!!!

patrick said...

Great job! Thank you for this sharing... web development

Hyung Gil said...

Thanks for your sharing code.

and I have one Question.

How convert image file to video file?

Please Help me.

my email address is gilgoona@gmail.com.

thanks for reading.

David said...

Thanx a lot! I have been searching for this for ages! Nobody was able to tell me where to start with augmented reality and here it is all together with an example. Great job and thanx a lot once more!

Sebastian said...

I am so thankful! You are truly a role model for developers!

Martin said...

Wow, this is so cool, thanks for sharing! I have been studying computer imaging and did my thesis on augmented reality some years ago, and has also written a simple sudoku solver in php once upon a time. I got the idea to add those two together in an iPhone app, but after a little googling I can see that you've done this in a much more nice-looking way than I could ever do. Now I have to come up with something else :)

Oded said...

Very useful post!
Very very useful post!

So useful, I bought the app, even though I don't play Soduko anymore... :)

RMD said...

Love the sound track to the video, what's it from?
