OpenGL ES 2 – Video on iPhone in 3D, and Texture-Map sources

This is a blog post I wrote 6 months ago, but my life got too busy to finish it off: to finish testing the code and improve the text. So I’m publishing this now, warts and all. There are probably bugs. There are bits that I could explain better – but I don’t have time. Note that the code is ripped from apps where I ACTUALLY USED IT – so I know it does work, and works well!

In brief:

  1. Using different “sources” of textures in OpenGL, not just images!
  2. Using CALayers as a source – allows for video in 3D!
  3. Using “Video straight from the video decoding chip” as a source – faster video!

Procedural vs Mapped Textures

Now you’ve made both Procedural Textures (Part 5) and TextureMapped Textures (Part 6), it’s time to see what they can each do.

The Real World versus The Digital World

You can use a photo as a TextureMap, isn’t that great? Does it look like the thing you photo’d?

Not much. In reality, the object should have highlights on sharp edges, and have shadows in the cracks. It probably refracts light (anything even partially transparent – including human skin). Some parts of it might be fractal – e.g. the bumpiness of waves in the sea remains the same even as you zoom out. Some parts would be static – e.g. in a brick wall: the arrangement of mortar between bricks doesn’t reappear at smaller scale as you zoom in!

So … in an ideal world … we’d use Procedural textures for everything. We’d get realistic lighting, software control of every variable.

But the world is full of video and photos already – billions of them – and none of them.

So, to make something look realistic in 3D, you need to describe it completely in software.

Sources

Every texture in GL has a source.

If it’s a texturemap, the simplest case: the “source” is a chunk of RAM on the GPU, labelled with an integer.

If it’s a procedural texture, the “source” is the GPU-program (Vertex/Fragment Shader pair) that generates colours on the fly. Since the GPU program could reference anything as input data, it may in turn have sources that feed into it. A common setup is:

  • Procedural Texture: reptile-skin … MADE OF:
    • Texture-Map texture: “small scratches.png”
    • Procedural Texture: large hexagonal scales/plates … MADE OF:
      • Procedural Texture: green-tint, with red shadows
      • Procedural Texture: lighting … MADE OF:
        • Texture-Map texture: light-map
        • Procedural Texture: dynamic white/black lighting … MADE OF:
          • Texture-Map texture: Normal-map
          • Procedural Texture: lighting model (Fresnel)

In both cases, we can trivially swap the source for a different one, render our Draw call, and it will use the new texture instead.

Recap on Texturing

We know:

  1. TextureMaps are huge, slow to upload to GPU, and “dumb data”; if you want them to change, you have to re-upload the bytes. Your GPU RAM limits how complex they can get.
  2. Procedural Textures are tiny, but have to be re-executed for every pixel you render; your GPU speed limits how complex they can get.
  3. In GL ES 2, the two systems are partially merged/integrated. TextureMaps need a simple Procedural texture (e.g. a one-line Fragment Shader, that passes-through the Texture Map values) to render them.
  4. Neither system uses BufferObjects, even though BOs/VBOs are designed for moving big chunks of data to/from the GPU, largely because textures were “invented” / added to GL long before VBOs existed.
  5. Shader Uniforms let us animate textures internally

All of this assumes you understand the previous two tutorials (Part 5: Texturing (procedural), Part 6: TextureMapping (bitmaps)). Now we can get onto the fun stuff…

There’s almost an infinite amount of cool things you can do once you master GL’s texturing systems.

Animated textures: the magic of Uniforms

What comes next is a little odd for developers who are new to OpenGL: to efficiently animate something in OpenGL, you simply “render it differently each frame”. As I explained in Part 2, OpenGL is re-drawing everything, on every frame, already – so this kind of “animation” comes for free.

To make this work, we’ll either need to re-upload our data (the shader program itself, or the Attributes for each Draw call) every frame (with new values) … or: we need some kind of variable in the Shader that we can change from Frame to Frame.

http://t-machine.org/index.php/2014/01/05/glkit-extended-refactoring-the-view-controller/

Summary of previous two posts:

  1. Texturing in GL ES 2: REQUIRES an algorithm (executes on the GPU, as a ShaderProgram)
    • (see Part 5 for source code + explanation on doing this)
  2. Texturing in GL ES 2: OPTIONALLY uses a bitmap (uploaded to the GPU’s RAM)
    • (see Part 6 for source code + explanation on doing this)
  3. Apple provides a 1-line method to read “any PNG or JPG” and upload it to the GPU
  4. OpenGL has a set of low-level methods to upload “raw bitmap data” to the GPU (we haven’t used those yet…)

There’s a huge world of options and features and effects you can do with OpenGL texturing. I’m trying to give you “just enough” to get you up-and-running on iOS, and then use Google (and OpenGL.org) to explore the other features yourself.

But there’s a couple of important things that are specific to iOS, and you won’t find in most GL texts…

Embedding UIKit, CALayer and CGContext -> OpenGL

Every windowing system needs a lowest common denominator (LCD) where all the GUI code eventually gets converted into instructions for the GPU hardware. With Apple’s OS’s, that’s CALayer. UIKit has its own LCD (UIView) – but even UIView uses a CALayer internally.

If we could integrate CALayer with OpenGL textures, we’d be able to convert any and all graphics on iOS into 3D. You could draw UITableView, or UILabel, inside an OpenGL view. You could take a UIImage and use it on the GPU (instead of “only images loaded from a PNG file”, as provided by Apple’s GLKit).

You might have heard of CATransform3D, which seems to do the same thing. Under the hood, Apple uses OpenGL to implement CALayer. So, they’ve already done this integration – CALayer has a property “transform”. This is NOT the normal CGAffineTransform you see in UIKit – instead it’s a CATransform3D. That property is an OpenGL 4×4 transform matrix, and you can use it to move and rotate CALayers “in 3D”.

Unfortunately, CATransform3D is very limited (blocks you from using: Shaders, Textures, 3D Geometry), and it has some nasty bugs. Few developers use it, and it seems Apple has given up on it. My advice: don’t bother.

UIKit/CALayer’s rendering model

CALayer unifies all Apple rendering (both “to screen”, and “to CGImageRef / UIImage”) with a simple architecture:

  1. Everything, ultimately, renders to a CGContextRef.
  2. When Apple is drawing to screen: Apple gives you a CGContextRef to use, and when you’re finished, they draw it to the screen
  3. When you’re creating your own images programmatically: you create your own CGContextRef and convert it into an Image
  4. CALayer has a method to draw itself (and its sublayers) directly to a CGContextRef: “renderInContext:”. Everything in UIKit is a UIView, and every UIView has a CALayer (its .layer property), so this covers all of Apple’s graphics classes

Externally, Apple provides special methods that let any developer “pull out” the results in a couple of formats:

  1. As a CGImageRef (almost identical to UIImage, but using Apple’s C-library)
  2. (iOS only): as a new UIImage
  3. As NSData (raw bytes, like a BMP or TIFF file)

I’ve seen people use the UIImage method above to do something like this (and, tragically, Apple’s engineering staff recommended it – but it’s a very stupid thing to do):

  • (DONT DO THIS!)
    1. Get a CGContextRef from Apple
    2. Draw to it / use UIKit methods / whatever
    3. Get a UIImage out of the CGContextRef
    4. Use Apple’s UIImagePNGRepresentation to convert the UIImage to a PNG file (in memory)
    5. Use Apple’s GLKTextureLoader to load the in-memory file to the GPU as a texture

This works (and I’ve used it to debug my code in the past), but it’s insane. PNG is designed to be hugely CPU-intensive to compress, but fast to decompress. The net effect: this technique is often 50x slower than the correct approach – instead of “0.1 seconds”, it can easily take 5-10 seconds to send a single screen of data. If you’re trying to get realtime animation or rendering, that makes it impossible!

Instead … both OpenGL and Apple support “raw stream of bytes, 4 numbers per pixel (Red, Green, Blue, Alpha)”. Apple’s preferred class for arrays-of-bytes is NSData.

So, we’ll use the same mechanism, but upload straight to the GPU as a new texture. We only need a couple of static methods, and the output will be a GLK2Texture, so we store them in an Objective-C Category on GLK2Texture itself:

GLK2Texture+CoreGraphics.h:

@interface GLK2Texture (CoreGraphics)

+(CGContextRef) createCGContextForOpenGLTextureRGBAWidth:(int) width h:(int) height bitsPerPixel:(int) bpp shouldFlipY:(BOOL) flipY fillColorOrNil:(UIColor*) fillColor;

+(GLK2Texture*) uploadTextureRGBAToOpenGLFromCGContext:(CGContextRef) context width:(int)w height:(int)h;

@end

GLK2Texture+CoreGraphics.m:

@implementation GLK2Texture (CoreGraphics)

+(CGContextRef) createCGContextForOpenGLTextureRGBAWidth:(int) width h:(int) height bitsPerPixel:(int) bpp shouldFlipY:(BOOL) flipY fillColorOrNil:(UIColor*) fillColor
{
...

You can create the CGContextRef yourself, manually, but it’s easy to get wrong. Apple’s library is a bit buggy on iOS (might get fixed in iOS 8?), and most of the options don’t work correctly for this specific use-case – so I recommend you use this method, which uses the “correct” format for the byte-array.

...
	NSAssert( (width & (width - 1)) == 0, @"PowerVR will render your texture as ALL BLACK because you provided a width that's not power-of-two");
	NSAssert( (height & (height - 1)) == 0, @"PowerVR will render your texture as ALL BLACK because you provided a height that's not power-of-two");
...

Important: if you try to capture a CGContext with width OR height that’s not “a power of two”, the GPU will silently reject it; debugging this is a nightmare. The GPU is capable of NPOT (non-power-of-two) textures – but in OpenGL ES 2 only with restrictions (wrap mode GL_CLAMP_TO_EDGE, and no mipmapping); break them and the texture samples as black. So: we Assert here in case you ever send a NPOT CGContext by accident.
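Here is a minimal sketch of the power-of-two test in plain C (which also compiles inside an Objective-C .m file). The function names are made up for illustration; they are not part of GLK2Texture. One thing worth knowing: the bit trick on its own also accepts 0, so this version rejects zero explicitly, and adds a hypothetical helper that rounds a size up to the next power of two:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* True only for 1, 2, 4, 8, ... A power of two has exactly one bit set,
   so clearing the lowest set bit (n & (n - 1)) leaves zero. */
static bool is_power_of_two(size_t n)
{
    return n != 0 && (n & (n - 1)) == 0;
}

/* Smallest power of two that is >= n (hypothetical helper for choosing
   a texture size that the assert would accept). */
static size_t next_power_of_two(size_t n)
{
    size_t p = 1;
    assert(n <= ((size_t)1 << 30)); /* keeps the shift below from overflowing */
    while (p < n)
        p <<= 1;
    return p;
}
```

For example, a 640-pixel-wide layer would need a 1024-wide CGContext, while 256 passes unchanged.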

	/** Create a texture to render from */
	/*************** 1. Convert to NSData */
	CGColorSpaceRef colorSpace = CGColorSpaceCreateDeviceRGB();
	CGContextRef context = CGBitmapContextCreate( NULL, width, height, 8, 4 * width, colorSpace, /** NB: from Apple: incompatible types EXPECTED here according to API docs! */ (CGBitmapInfo) kCGImageAlphaPremultipliedLast  );
	CGColorSpaceRelease( colorSpace );

…that’s standard code for creating a custom CGContextRef for use anywhere in their CA/Quartz API. Check the CGBitmapContextCreate docs for more info (if interested).

Note: the typecast looks dodgy (blame Apple, someone forgot to copy/paste the enum values across) – but it’s “officially” correct. Also, Apple apparently hasn’t implemented the full API on iOS (it was originally OS X only) – so if you send one of the other enum values, Apple will either crash or silently fail. Finally … despite the name of this constant (kCGImageAlphaPremultipliedLast), Apple seems to send NON pre-multiplied alpha. Basically: this method sucks donkey, and hopefully someone at Apple will clean it up one day!

	
	if( fillColor != nil )
	{
		CGContextSetFillColorWithColor(context, fillColor.CGColor);
		CGContextFillRect(context, CGRectMake(0,0,width,height));
	}
	
	if( flipY )
	{
		CGAffineTransform flipVertical = CGAffineTransformMake( 1, 0, 0, -1, 0, height );
		CGContextConcatCTM(context, flipVertical);
	}
	
	return context;
}

…convenience code: if you’re generating a texture from a CALayer that comes from UIKit (i.e. from a UIView), Apple will automatically flip it upside-down. Which is tragic, since OpenGL and CALayer use the same idea of “up” – no flip was needed! But the support for “not flipping it” only exists on OS X, and Apple has deprecated it; in short: we do it ourselves.

For CALayers that were created by CoreAnimation methods (not UIKit methods), the flipping is unnecessary.
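If the six numbers passed to CGAffineTransformMake look like magic, this sketch applies the same transform by hand in plain C. The struct and function names are stand-ins invented for illustration (they mirror CGAffineTransform's six fields, but are not Apple's types). It shows that the transform leaves x alone and maps y to (height - y):

```c
#include <assert.h>

/* Plain-C stand-in for CGAffineTransform: same six fields, same meaning. */
typedef struct { double a, b, c, d, tx, ty; } Affine2D;
typedef struct { double x, y; } Point2D;

/* Standard affine mapping: x' = a*x + c*y + tx, y' = b*x + d*y + ty */
static Point2D affine_apply(Affine2D t, Point2D p)
{
    Point2D out;
    out.x = t.a * p.x + t.c * p.y + t.tx;
    out.y = t.b * p.x + t.d * p.y + t.ty;
    return out;
}

/* Same values as CGAffineTransformMake( 1, 0, 0, -1, 0, height ):
   scale y by -1, then shift everything back up by the context height. */
static Affine2D make_vertical_flip(double height)
{
    Affine2D t = { 1, 0, 0, -1, 0, height };
    assert(height > 0);
    return t;
}
```

With a 256-high context, a point on row 0 lands on row 256, and the middle row (128) stays where it is.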

+(GLK2Texture*) uploadTextureRGBAToOpenGLFromCGContext:(CGContextRef) context width:(int)w height:(int)h
{
	void* resultAsVoidStar = CGBitmapContextGetData(context);
	
	size_t dataSize = 4 * w * h; // RGBA = 4 * 8-bit components == 4 * 1 bytes
	NSData* result = [NSData dataWithBytes:resultAsVoidStar length:dataSize];
	
	CGContextRelease(context);

…again, this is standard Apple code for “create an array of bytes, containing raw colours for each pixel, from the contents of a CGContextRef”. But the width, height, and number-of-bytes-per-pixel have to match with the numbers we used in the previous method.

	
	/*************** 2. Upload NSData to OpenGL */
	GLK2Texture* newTextureReference = [[[GLK2Texture alloc] init] autorelease];
	

Reminder: our GLK2Texture class automatically allocates a new texture on the GPU (as we did with VBOs, VAOs, etc), and saves the GPU name as GLK2Texture.glName

	glBindTexture( GL_TEXTURE_2D, newTextureReference.glName);

…as with VBO/VAO/etc, we have to choose (“bind”) an active texture before configuring it…

	glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER, GL_LINEAR);

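…when the texture has more detail than the screen can show (e.g. you zoomed out too far), this controls how the GPU shrinks/blurs/smooths it (GL_TEXTURE_MAG_FILTER is the equivalent for zooming in too far). This is also how you’d enable and configure Mipmapping (but we’re ignoring that for now). Because we’re ignoring it, this line isn’t optional: GL’s default minification filter expects mipmaps, and without them the texture renders black.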

	glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA, (int)w, (int)h, 0, GL_RGBA, GL_UNSIGNED_BYTE, [result bytes]);
	
	return newTextureReference;
}

…finally, we upload the texture itself – and tell the GPU what width and height it is (these are saved on the GPU so it can render the texture later).
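The two methods above only work together because they assume the same memory layout: 4 bytes per pixel (R, G, B, A), rows packed back-to-back with no padding (that is the “4 * width” bytes-per-row passed to CGBitmapContextCreate). A rough sketch of that arithmetic in plain C, with hypothetical helper names:

```c
#include <assert.h>
#include <stddef.h>

enum { BYTES_PER_PIXEL_RGBA = 4 }; /* 4 components x 8 bits each */

/* Bytes in one row of pixels, assuming no padding between rows. */
static size_t rgba_bytes_per_row(size_t width)
{
    return BYTES_PER_PIXEL_RGBA * width;
}

/* Total size of the byte-array handed to glTexImage2D. */
static size_t rgba_buffer_size(size_t width, size_t height)
{
    return rgba_bytes_per_row(width) * height;
}

/* Byte offset of the Red component of pixel (x, y);
   Green, Blue and Alpha follow at +1, +2 and +3. */
static size_t rgba_pixel_offset(size_t width, size_t x, size_t y)
{
    assert(x < width);
    return y * rgba_bytes_per_row(width) + x * BYTES_PER_PIXEL_RGBA;
}
```

For the 256x256 context used in the demo, that comes to 262,144 bytes (256 KB) per upload. If the width or height passed to the upload method differs from the one used to create the context, the GPU reads the rows at the wrong stride and the image comes out sheared or garbled.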

Demo: CGContext drawing on a 3D triangle

Assuming you’re using the refactored version of the sample code, we can now demo this very quickly. Make a GLK2DrawCallViewController subclass, and make a single Draw call:

CALayerTextureViewController.m:

@implementation CALayerTextureViewController

-(NSMutableArray*) createAllDrawCalls
{	
	/** All the local setup for the ViewController */
	NSMutableArray* result = [NSMutableArray array];
	
	/** -- Draw Call 1: triangle that contains a CALayer texture
	 */

	GLK2DrawCall* dcTri = [CommonGLEngineCode drawCallWithUnitTriangleAtOriginUsingShaders:
						   [GLK2ShaderProgram shaderProgramFromVertexFilename:@"VertexProjectedWithTexture" fragmentFilename:@"FragmentWithTexture"]];
	
	/** Do some drawing to a CGContextRef */
	CGContextRef cgContext = [GLK2Texture createCGContextForOpenGLTextureRGBAWidth:256 h:256 bitsPerPixel:8 shouldFlipY:FALSE fillColorOrNil:[UIColor blueColor]];
	CGContextSetFillColorWithColor( cgContext, [UIColor yellowColor].CGColor );
	CGContextFillEllipseInRect( cgContext, CGRectMake( 50, 50, 150, 150 ));
	
	/** Convert the CGContext into a GL texture */
	GLK2Texture* newTexture = [GLK2Texture uploadTextureRGBAToOpenGLFromCGContext:cgContext width:256 height:256];
	
	/** Add the GL texture to our Draw call / shader so it uses it */
	GLK2Uniform* samplerTexture1 = [dcTri.shaderProgram uniformNamed:@"s_texture1"];
	[dcTri setTexture:newTexture forSampler:samplerTexture1];
	
	/** Set the projection matrix to Identity (i.e. "dont change anything") */
	GLK2Uniform* uniProjectionMatrix = [dcTri.shaderProgram uniformNamed:@"projectionMatrix"];
	GLKMatrix4 rotatingProjectionMatrix = GLKMatrix4Identity;
	[dcTri.shaderProgram setValueOutsideRenderLoopRestoringProgramAfterwards:&rotatingProjectionMatrix forUniform:uniProjectionMatrix];
	
	[result addObject:dcTri];
	
	return result;
}
@end

You should see the triangle with a yellow-circle on a blue-background:

[Image: the triangle, textured with a yellow circle on a blue background]

It’s a very bad idea to overwrite a texture that’s already being rendered, but on alternate frames you could create a new GLK2Texture, upload a new snapshot of your CGContext, and then tell the drawcall to switch textures. This way you can easily render UIKit widgets directly into OpenGL – stick them on the walls of your 3D world.
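One way to organise the “never overwrite the texture that is being rendered” rule is a ping-pong pair: two long-lived textures, where you upload into whichever one the Draw call is NOT using, then swap. This is a variation on the idea above (which creates a fresh texture each time), sketched in plain C with invented names; the integers stand in for GL texture names:

```c
#include <assert.h>

typedef struct {
    unsigned int names[2]; /* two GL texture names (hypothetical values) */
    int          front;    /* index of the one the Draw call is sampling */
} TexturePair;

/* The texture the Draw call should use this frame. */
static unsigned int texture_to_render(const TexturePair* p)
{
    return p->names[p->front];
}

/* The texture that is safe to overwrite with a new snapshot. */
static unsigned int texture_to_upload(const TexturePair* p)
{
    return p->names[1 - p->front];
}

/* Call once the upload has finished: the fresh texture becomes visible. */
static void texture_pair_swap(TexturePair* p)
{
    assert(p->front == 0 || p->front == 1);
    p->front = 1 - p->front;
}
```

After each swap you would still need to tell the Draw call about the new texture (as setTexture:forSampler: does in the demo above).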

NOTE: Apple’s UIScrollView has never quite worked properly, and any Apple widget – e.g. UITableView – that contains a UIScrollView inherits its problems. If/when you use this approach to put UIKit widgets into 3D, you’ll get problems with touch-handling and scrolling. By default, Apple freezes OpenGL when a UITableView starts scrolling. This seems like a bug, but it’s a major performance optimization for UITableView. Plenty of Googling ahead of you if you want to go down that route…

Embedding live video from the iPhone camera -> OpenGL

Using the above trick, and the iOS video API’s, you could:

  1. Save video to disk
  2. Load it from the file
  3. Take each frame, and upload as a new GL texture

…but that won’t work for “live” video straight from the camera. You could get a bit cleverer, and use the low-level callbacks in AVFoundation to instead:

  1. Take any playing video
  2. “Capture” individual frames while it decodes / streams
  3. Draw the frame to a CGContextRef
  4. Use the code from previous section to upload the CGContext’s contents to a GL texture

…and this works fine. For small videos, it’s pretty fast too. But isn’t this a bit ridiculous? All phones since the iPhone 3GS contain a dedicated “video chip” that handles the camera, and handles decoding MPEG streams. If the data is already there, shouldn’t we be keeping it there, instead of downloading to CPU then uploading to GPU? Indeed!

What we’ll do now is:

  1. Take any video-source from AVFoundation (live video from camera, video loaded from disk, etc)
  2. Use a special Apple-provided Capture that keeps the data in raw video format on the device
  3. Use a special Apple-provided method to re-route the frames directly to the GPU
  4. Use a special Shader that renders the frames in the native video format (faster than converting to RGB)

…but it’s a lot of complex code. This took me a while to get right ;).

Use AVFoundation to capture (or decode) some video

AVF is awesome; if you don’t know how to use it yet, I recommend reading some tutorials and playing with it. It allows you to treat “any video, audio, or combination” as input, and do anything with it – save to disk, modify it on the fly, add graphical overlays, swap the sound-track, or even: upload it to OpenGL. And the code is identical, save for 1-3 lines, whether you’re taking video from the camera, or reading it from disk.

NOTE: make sure you add AVFoundation.framework to your project’s “Link Binary with Libraries” Build Phase; otherwise Xcode will rightly fail to build!

We’ll need a new ViewController, which supports the callback AVF uses when you’re capturing frames:

VideoTextureViewController.h:

#import <AVFoundation/AVFoundation.h>

@interface VideoTextureViewController : GLK2DrawCallViewController <AVCaptureVideoDataOutputSampleBufferDelegate>

@end

…and we have to do some fairly standard AVF setup. But first, to support the GL capture/conversion, we have to use Apple’s CoreVideo, and pre-create a “texture cache” object:

VideoTextureViewController.m:

@interface VideoTextureViewController ()
{
	#pragma mark - Apple efficient video textures
	CVOpenGLESTextureCacheRef coreVideoTextureCache;
}
...
- (AVCaptureSession*) setupAVCapture
{	
    //-- Create CVOpenGLESTextureCacheRef for optimal CVImageBufferRef to GLES texture conversion.
#if COREVIDEO_USE_EAGLCONTEXT_CLASS_IN_API
    CVReturn err = CVOpenGLESTextureCacheCreate(kCFAllocatorDefault, NULL, self.localContext, NULL, &coreVideoTextureCache);
#else
    CVReturn err = CVOpenGLESTextureCacheCreate(kCFAllocatorDefault, NULL, (__bridge void *)self.localContext, NULL, &coreVideoTextureCache);
#endif
    if (err)
    {
        NSLog(@"Error at CVOpenGLESTextureCacheCreate %d", err);
        return  nil;
    }
...

Note that this CVOpenGLESTextureCacheCreate method from Apple requires our EAGLContext, as well as a pointer to our (currently empty) texture-cache. This allows Apple to do efficient conversion later on.

Now we get on to typical AVF config: we create an AVCaptureSession and add an AVCaptureDevice to provide input (in this case: we’ll use the default Camera):

...
    AVCaptureSession* session;
    //-- Setup Capture Session
    session = [[[AVCaptureSession alloc] init] autorelease];
	[session retain];
    [session beginConfiguration];
	
    //-- Set preset session size.
    [session setSessionPreset:AVCaptureSessionPreset640x480];
	
    //-- Create a video device and input from that Device.  Add the input to the capture session.
    AVCaptureDevice * videoDevice = [AVCaptureDevice defaultDeviceWithMediaType:AVMediaTypeVideo];
    if(videoDevice == nil)
        assert(0);
	
    //-- Add the device to the session.
    NSError *error = nil; // must start as nil: Apple only writes to it on failure
    AVCaptureDeviceInput *input = [AVCaptureDeviceInput deviceInputWithDevice:videoDevice error:&error];
    if(input == nil) // check the return value; "error" is only meaningful when this is nil
        assert(0);
	
    [session addInput:input];
...

Then it gets interesting again. Instead of simply grabbing the output, we ask for the output in the same format it starts in, so that we spare the hardware from converting it back-and-forth (improving performance):

...
    //-- Create the output for the capture session.
	AVCaptureVideoDataOutput * dataOutput = [[AVCaptureVideoDataOutput alloc] init];
	[dataOutput setAlwaysDiscardsLateVideoFrames:YES]; // Probably want to set this to NO when recording
	
    //-- Set to YUV420.
	[dataOutput setVideoSettings:[NSDictionary dictionaryWithObject:[NSNumber numberWithInt:kCVPixelFormatType_420YpCbCr8BiPlanarFullRange] 
	                                                         forKey:(id)kCVPixelBufferPixelFormatTypeKey]]; // Necessary for manual preview
...	

With the output created, we also use the AVF feature that gives us a callback each time a new video-frame is decoded, and point it back to the class we’re currently writing (then we add it to the AVF Session):

....
    // Set dispatch to be on the main thread so OpenGL can do things with the data
	[dataOutput setSampleBufferDelegate:self queue:dispatch_get_main_queue()];        
	
    [session addOutput:dataOutput];
	[session commitConfiguration];
...

And finally (as with normal AVF projects), we tell AVF to “start” the session (this will immediately start capturing video frames):

...
    [session startRunning];
	
	return session;
}

Use CoreVideo to convert AVFoundation video to OpenGL textures

This is where it gets tricky; the code that follows is correct – but Apple breaks some of OpenGL’s core concepts, and refuses to document how or why they did it. What we have here is a workaround based on “intelligent guesses”, “trial and error”, and “lots of experiments to reverse-engineer Apple’s code”.

As a StackOverflow question from almost 2 years ago puts it:

“Where is the official documentation for CVOpenGLESTexture method types?” … “Unfortunately, there really isn’t any”

Adam’s interpretation of CoreVideo, CVOpenGLESTexture, OpenGL ES upload on iOS

If Apple ever documents their API (!), I’ll come back and revise this. You don’t need to understand this to make it work – but if you want to tweak it or change it, it would help! So far as I can tell:

  1. When AVF has a frame in-memory, CoreVideo (CV) can access and act on that frame without downloading it to the CPU
  2. Because the frame doesn’t exist on the CPU, Apple has to “fake” it to make it work with OpenGL (not Apple’s fault, it’s a side-effect of GL’s very old TextureMapping API)
    • (if my guess here is correct … perhaps a better solution from Apple would have been to extend OpenGL (as they’ve done before) and create OpenGL methods that supported this)
  3. Apple has a class called a “texture cache” that does not necessarily cache textures; instead, it provides virtual textures backed by GPU memory (but not OpenGL memory)
  4. Video-decoding happens on a different hardware chip to the one that does GL rendering; this means we have a multi-threading problem, by definition
  5. The texture-cache (which isn’t a cache) automatically implements double-buffering internally to get around that. Depending on internal data, it changes the name of each GL texture from frame-to-frame.
  6. ALSO: if the GPU blocks the Video Decoder at “the wrong moment”, it switches to triple-buffering, and then goes back to double-buffering
  7. Net effect: over time, if the GPU is doing a lot of work, Apple’s code can end up generating infinitely-many virtual textures, instead of just 1 (or 2 for double-buffering)

Confused? No worries. Here’s some boilerplate code that does the job…

Convert a video-frame to OpenGL

We implement the callback method from AVF:

VideoTextureViewController.m:

- (void)captureOutput:(AVCaptureOutput *)captureOutput didOutputSampleBuffer:(CMSampleBufferRef)sampleBuffer fromConnection:(AVCaptureConnection *)connection
{
...

First we grab some metadata about the frame – e.g. the width and height in pixels (could change from frame to frame if you switched camera in the middle of running the app):

...
    CVReturn err;
    CVImageBufferRef pixelBuffer = CMSampleBufferGetImageBuffer(sampleBuffer);
    GLsizei width = (GLsizei)CVPixelBufferGetWidth(pixelBuffer);
    GLsizei height = (GLsizei)CVPixelBufferGetHeight(pixelBuffer);
	if (!coreVideoTextureCache)
    {
        NSLog(@"No video texture cache");
        return;
    }
...

Remember: (my guess) the coreVideoTextureCache isn’t really a cache, it’s holding “virtual” textures. We can safely “release” them, and it will have no effect – they weren’t real to start with.

Also, this coreVideoTextureCache seems to “ignore” successive frames unless you explicitly call the “Flush” method; Apple’s header files seem to say it’s not necessary, that flushing is automatic – but this is not true (try commenting it out).

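We’re CFRelease’ing because we’re about to use a C method with the word “Create” in it; by Core Foundation convention that means the caller owns the returned object (it arrives already retained), so we have to release last frame’s textures ourselves before we replace them.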

...
	if( appleCreatedTexture1 != NULL )
		CFRelease( appleCreatedTexture1 ); // Apple requires this
	if( appleCreatedTexture2 != NULL )
		CFRelease( appleCreatedTexture2 ); // Apple requires this

	// Periodic texture cache flush every frame
	CVOpenGLESTextureCacheFlush(coreVideoTextureCache, 0); // Apple requires this
...

Now we “create” new virtual-textures from the cache. We need two textures for one frame, because the default encoding for video is a pair of images (Y and UV).

...
 // Y-plane
    err = CVOpenGLESTextureCacheCreateTextureFromImage(kCFAllocatorDefault, 
                                                       coreVideoTextureCache,
                                                       pixelBuffer,
                                                       NULL,
                                                       GL_TEXTURE_2D,
                                                       GL_RED_EXT,
                                                       width,
                                                       height,
                                                       GL_RED_EXT,
                                                       GL_UNSIGNED_BYTE,
                                                       0,
                                                       &appleCreatedTexture1);
	
    if (err) 
    {
        NSLog(@"Error at CVOpenGLESTextureCacheCreateTextureFromImage %d", err);
    }   
	
    // UV-plane
    err = CVOpenGLESTextureCacheCreateTextureFromImage(kCFAllocatorDefault, 
                                                       coreVideoTextureCache,
                                                       pixelBuffer,
                                                       NULL,
                                                       GL_TEXTURE_2D,
                                                       GL_RG_EXT,
                                                       width/2,
                                                       height/2,
                                                       GL_RG_EXT,
                                                       GL_UNSIGNED_BYTE,
                                                       1,
                                                       &appleCreatedTexture2);
	
    if (err) 
    {
        NSLog(@"Error at CVOpenGLESTextureCacheCreateTextureFromImage %d", err);
    }
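The width/2 and height/2 in the second call come from the “420” in the pixel format: the chroma plane is stored at half resolution on both axes, with the two chroma values (Cb, Cr) interleaved as byte pairs (hence the two-channel GL_RG_EXT format). A sketch of that layout in plain C; the names are invented for illustration, and it ignores any row padding CoreVideo may add to the real buffers:

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    size_t width, height;   /* in texels */
    size_t bytes_per_texel;
} PlaneLayout;

/* Plane 0: luma (Y). One byte per pixel, full resolution. */
static PlaneLayout luma_plane(size_t frame_w, size_t frame_h)
{
    PlaneLayout p = { frame_w, frame_h, 1 };
    return p;
}

/* Plane 1: chroma. Interleaved (Cb, Cr) byte pairs, half resolution on both axes. */
static PlaneLayout chroma_plane(size_t frame_w, size_t frame_h)
{
    PlaneLayout p = { frame_w / 2, frame_h / 2, 2 };
    assert(frame_w % 2 == 0 && frame_h % 2 == 0); /* 4:2:0 needs even dimensions */
    return p;
}

/* Tightly-packed size of one plane, in bytes. */
static size_t plane_bytes(PlaneLayout p)
{
    return p.width * p.height * p.bytes_per_texel;
}
```

For the 640x480 preset used earlier, that is 307,200 bytes of luma plus 153,600 bytes of chroma: 1.5 bytes per pixel, versus 4 bytes per pixel if the frame had been converted to RGBA first.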

At this point, we have:

  1. A virtual texture with Y / luminance
  2. A virtual texture with UV / Chroma
  3. Both textures have been “forced” to update themselves with the latest video-frame (thanks to the …Flush… command)

We want to create simple, standard GL texture references out of these virtual textures. We’ve got a nice, simple class for that – GLK2Texture.

But it’ll need extending to support “virtual” textures:

  1. The textures are created by CoreVideo, and already have a glName; we need a new constructor to GLK2Texture
  2. The virtual textures must NOT be deleted from GPU when the GLK2Texture deallocs (this corrupts the rendering state); we need a way to turn-off that behaviour
  3. When Apple changes the virtual texture to a different glName, we need to switch our glName too without deleting/reallocing our GLK2Texture; it’s the same texture, but its GPU name has changed

These are odd (or dangerous) changes, specific to CoreVideo, so we put them into a category on GLK2Texture:

GLK2Texture+CoreVideo.h:

@interface GLK2Texture (CoreVideo)

+(GLK2Texture*) texturePreCreatedByApplesCoreVideo:(CVOpenGLESTextureRef) appleCoreVideoTexture;

-(void) liveAlterGLNameToWorkaroundAppleCoreVideoBug:(GLuint) newName;

@end

…with an ordinary constructor, reading the virtual-texture’s name at runtime:

GLK2Texture+CoreVideo.m:

@implementation GLK2Texture (CoreVideo)

+(GLK2Texture*) texturePreCreatedByApplesCoreVideo:(CVOpenGLESTextureRef) appleCoreVideoTexture
{
	GLK2Texture* newValue = [[[GLK2Texture alloc] initWithName:CVOpenGLESTextureGetName(appleCoreVideoTexture)]autorelease];
	
	return newValue;
}
...

Generally, you want GL names to be constant. We can’t do that here, but I don’t want to expose a dangerous feature for normal usage. To be on the safe side, we use an ObjC class-extension-header. This makes it “private, but visible internally to the library”:

GLK2Texture_MutableName.h:

@interface GLK2Texture()
@property(nonatomic, readwrite) GLuint glName;
@end

GLK2Texture.m:

...
#import "GLK2Texture_MutableName.h"
...

(any class that imports that header gets to treat GLK2Texture.glName as a writeable property. Classes that “only” import GLK2Texture.h see it as a readonly property)

GLK2Texture+CoreVideo imports that header, and is able to implement a method that changes this property:

GLK2Texture+CoreVideo.m:

...
-(void) liveAlterGLNameToWorkaroundAppleCoreVideoBug:(GLuint) newName
{
	self.glName = newName;
}
...

Putting this all together, we’re able to selectively create (or update) our GLK2Texture when the Apple texture changes. If we’re creating, we also add the GLK2Texture to our Drawcall – but if we’re simply overwriting the .glName, we don’t need to (the render loop will re-read the new value next time it loops round. Isn’t OOP wonderful? ;)) :

...
	BOOL glNameWasCreatedOrChanged = FALSE;
	if( self.textureVideoLuminance == nil )
	{
		glNameWasCreatedOrChanged = TRUE;

		self.textureVideoLuminance = [GLK2Texture texturePreCreatedByApplesCoreVideo:appleCreatedTexture1];
		self.textureVideoLuminance.willDeleteOnDealloc = FALSE;
		
		[self.drawCallThatRendersVideoTextures setTexture:self.textureVideoLuminance forSampler:[self.drawCallThatRendersVideoTextures.shaderProgram uniformNamed:@"s_texture1"]];
	}
	else if( self.textureVideoLuminance.glName != CVOpenGLESTextureGetName(appleCreatedTexture1) )
	{
		glNameWasCreatedOrChanged = TRUE;
		
		[self.textureVideoLuminance liveAlterGLNameToWorkaroundAppleCoreVideoBug:CVOpenGLESTextureGetName(appleCreatedTexture1)];
	}
...

But what’s this “glNameWasCreatedOrChanged” BOOL for?

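Well, it turns out that Apple’s virtual textures are more than a little bizarre. By default, they render solid black – unless you tell them to “clamp texturemap to edge”. This is very strange, and even more strange: it needs re-doing every time Apple flips their double-buffer. None of this is documented; Apple’s source code does it every frame without explanation. But if you turn it off (it’s seemingly “useless” code), everything breaks. (A plausible explanation: video frames such as 640x480 are non-power-of-two, and OpenGL ES 2 only samples NPOT textures when the wrap mode is GL_CLAMP_TO_EDGE; wrap modes are stored per texture name, so every new name starts again with the default, GL_REPEAT.)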

Anyway, when it’s changed, we re-configure the GL virtual texture with clamping:

...
	if( glNameWasCreatedOrChanged )
	{
		glActiveTexture(GL_TEXTURE0);
		glBindTexture( CVOpenGLESTextureGetTarget(appleCreatedTexture1), CVOpenGLESTextureGetName(appleCreatedTexture1));
		glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_CLAMP_TO_EDGE);
		glTexParameterf(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_CLAMP_TO_EDGE);
	}
...

…then we repeat all that for the other texture (UV / Chroma). C.f. the GitHub source to see (it’s a copy/paste, with the references changed).

Custom Shader to render Y/UV video colours as RGB

OpenGL ES uses RGB / RGBA – it doesn’t support YUV and other exotic formats.

Apple provides us the Fragment shader that will do the conversion on-the-fly:

FragmentVideoPairTexture.fsh:

varying mediump vec2 varyingtextureCoordinate;

uniform sampler2D s_texture1, s_texture2;

void main()
{
	mediump vec3 yuv;
	lowp vec3 rgb;
	
	yuv.x = texture2D(s_texture1, varyingtextureCoordinate).r;
	yuv.yz = texture2D(s_texture2, varyingtextureCoordinate).rg - vec2(0.5, 0.5);
	
    // Using BT.709 which is the standard for HDTV
    // (GLSL fills matrices column-by-column: each line below is one column)
    rgb = mat3(      1.0,      1.0,    1.0,
                     0.0, -0.18732, 1.8556,
                 1.57481, -0.46813,    0.0) * yuv;
	
    gl_FragColor = vec4(rgb, 1);
}
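Shaders are hard to debug, so it can help to run the same arithmetic on the CPU and compare against a known pixel. The sketch below is a plain C port of the shader above, using the same coefficients. The names are invented for illustration, and the final clamp to 0..1 is an assumption that mimics what an ordinary 8-bit framebuffer does to gl_FragColor:

```c
#include <assert.h>

typedef struct { float r, g, b; } RGBf;

static float clamp01(float v)
{
    return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);
}

/* y, cb, cr are texture samples in the range 0..1, as texture2D() returns them.
   The chroma values are re-centred on zero, exactly as the shader does. */
static RGBf yuv_to_rgb_bt709(float y, float cb, float cr)
{
    RGBf out;
    float u = cb - 0.5f;
    float v = cr - 0.5f;
    assert(y >= 0.0f && y <= 1.0f);

    /* Written out row-by-row; the shader's mat3 lists the same numbers column-by-column */
    out.r = clamp01(y                + 1.57481f * v);
    out.g = clamp01(y - 0.18732f * u - 0.46813f * v);
    out.b = clamp01(y + 1.8556f  * u);
    return out;
}
```

A quick sanity check: with both chroma samples at 0.5 (i.e. “no colour”), the output is a pure grey equal to the luminance.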

Putting it all together: live video texturing a rotating cube

Finally, we can write the Draw calls that use all the above:

VideoTextureViewController.m:

...
-(NSMutableArray*) createAllDrawCalls
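{	
	NSMutableArray* result = [NSMutableArray array];
	
	GLK2DrawCall* dcCube = [CommonGLEngineCode drawCallWithUnitCubeAtOriginUsingShaders:
	[GLK2ShaderProgram shaderProgramFromVertexFilename:@"VertexProjectedWithTexture" fragmentFilename:@"FragmentVideoPairTexture"]];
	
	/** The AVF callback uses this reference later, to attach the video textures */
	self.drawCallThatRendersVideoTextures = dcCube;

	[result addObject:dcCube];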
	
	[self setupAVCapture];
	
	return result;
}

…and we can implement a Uniform that changes the projection every frame, causing the cube to rotate:

...
-(void)willRenderDrawCallUsingVAOShaderProgramAndDefaultUniforms:(GLK2DrawCall *)drawCall
{
	/*************** Rotate the entire world, for Shaders that support it *******************/
	GLK2Uniform* uniProjectionMatrix = [drawCall.shaderProgram uniformNamed:@"projectionMatrix"];
	if( uniProjectionMatrix != nil )
	{
		/** Generate a smoothly increasing value using GLKit's built-in frame-count and frame-timers */
		long slowdownFactor = 5; // scales the counter down before we modulus, so rotation is slower
		long framesOutOfFramesPerSecond = self.framesDisplayed % (self.framesPerSecond * slowdownFactor);
		float fractionOfFullTurn = framesOutOfFramesPerSecond / (float) (self.framesPerSecond * slowdownFactor); // 0..1, not radians yet
		
		// rotate it: one full turn is 2*PI radians
		GLKMatrix4 rotatingProjectionMatrix = GLKMatrix4MakeRotation( fractionOfFullTurn * 2.0 * M_PI, 1.0, 1.0, 1.0 );
		
		[drawCall.shaderProgram setValue:&rotatingProjectionMatrix forUniform:uniProjectionMatrix];
	}
}
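The frame-counter arithmetic above is easy to get subtly wrong, so here it is pulled out into plain C where it can be checked without a GPU. The function names are hypothetical. The counter wraps every (framesPerSecond * slowdownFactor) frames, which gives a fraction of a full turn between 0 and 1; multiplying by 2*PI converts that to radians:

```c
#include <assert.h>

static const double kTwoPi = 6.283185307179586;

/* Fraction of a full turn, in the range [0, 1).
   Wraps back to 0 every (frames_per_second * slowdown_factor) frames. */
static double turn_fraction(long frames_displayed, long frames_per_second, long slowdown_factor)
{
    long frames_per_turn = frames_per_second * slowdown_factor;
    assert(frames_per_turn > 0); /* a zero frame-rate would divide by zero */
    return (double)(frames_displayed % frames_per_turn) / (double)frames_per_turn;
}

/* The angle that would be handed to GLKMatrix4MakeRotation. */
static double rotation_radians(long frames_displayed, long frames_per_second, long slowdown_factor)
{
    return turn_fraction(frames_displayed, frames_per_second, slowdown_factor) * kTwoPi;
}
```

At 30 frames per second with a slowdown factor of 5, one full rotation takes 150 frames (5 seconds), and frame 75 is exactly half a turn.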

Run it on the device, and you’ll get a rotating cube:

NOTE: this will crash if you run it on Simulator – there’s no camera on the Simulator!

[Image: IMG_4938.PNG]