OpenGL ES2 – Shader Uniforms

There’s a famous animated GIF of an infinitely swirling snake (here’s one that’s been Harry Potterised with the Slytherin logo):

Impressive, right?

What if I said it only relies upon one variable, and that you can reproduce this yourself in 3D in mere minutes? It’ll take quite a lot longer to read and understand, but once you’ve grokked it, you’ll be able to do this easily.

Background reading

Make sure you understand:

  1. Draw Calls
  2. VAOs + VBOs
  3. and at least one kind of Texturing

Uniforms vs. Vertex Attributes

In theory, you don’t need Uniforms: they are a special kind of Vertex-Attribute (which gives them their name: they are a “uniform attribute”).

In practice, as we’ve already seen, bitmap-based texture-mapping in GL ES requires Uniforms to make a link between Texture and Shader.

The code for sending them to the GPU is different from sending Vertex Attributes – annoyingly, it’s more complicated – but the concept is identical. Then why do we have them?

Performance.

And, as a bonus: convenience.

How many items per Spaceship?

A typical 3D model of a spaceship has:

  • 5,000 polygons
  • 10,000 vertices
  • 5 bitmap textures

OK, fine, so …

Performance basics

Your vertices each specify which texture(s) they’re using. If you want to change the textures, from “standard” ones to “damaged” ones, you’ll have to:

  1. Upload the new texture (one-time cost; once it’s on GPU you can fast-switch between textures)
  2. Re-Upload the “uses texture 1 (out of the 5)” vertex-attribute … once for every vertex (repeated cost: has to be done every time)

Uniforms bypass this by saying:

A GL Uniform is a Vertex Attribute that has the same value for EVERY vertex. It only needs to be uploaded ONCE and is immediately applied to every vertex
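To put rough numbers on that, here is a minimal sketch in plain C (no GL calls). It assumes each vertex stores its "which texture" value as a single float; the `ModelStats` type and both function names are made up for illustration, and a real driver may batch or pad uploads differently.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a model: how many vertices it has, and how
   big the per-vertex value is that we want to change. */
typedef struct {
    size_t vertex_count;        /* e.g. 10,000 for the spaceship above */
    size_t bytes_per_attribute; /* size of the value being changed */
} ModelStats;

/* Bytes re-sent when the value lives in a per-vertex attribute:
   one copy for every vertex in the model. */
size_t bytes_to_update_attribute(ModelStats model)
{
    assert(model.vertex_count > 0);
    return model.vertex_count * model.bytes_per_attribute;
}

/* Bytes re-sent when the same value lives in a Uniform:
   one copy in total, however many vertices there are. */
size_t bytes_to_update_uniform(ModelStats model)
{
    return model.bytes_per_attribute;
}
```

For the 10,000-vertex ship with a 4-byte float, that is 40,000 bytes per change as a vertex attribute versus 4 bytes as a Uniform.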

But there’s more …

Each vertex can hold up to 2kb of data (in OpenGL ES; more on desktop GL) – so in the worst case our 10,000-vertex ship could take around 20 megabytes of GPU RAM. But that’s small by today’s standards – and as the model gets more complicated, the storage needed increases.

By contrast, the number of Uniforms needed for a model is typically constant.

The net effect is that GPU vendors can afford to use faster RAM for their Uniforms, boosting performance even further.

Convenience basics

Revisiting that spaceship: if we’re using it in a game, there are a lot more things we’ll want to include in the 3D model:

  • 10 different versions, one for each player. They’re all the same, but some elements change colour to match the Player’s colour
  • Some of the textures animate: e.g. Landing strips, with lights that strobe
  • Gun turrets need to rotate hundreds of vertices at once, without affecting their neighbours
  • … and so on.

Each of these becomes trivial when your DrawCall has global-constants – i.e. Uniforms. For instance:

  • 10 different versions…:
    • The vertices that might change colour have a vertex attribute signalling this; at render-time, the shader sees this flag on the vertex, and reads from a Uniform what the “actual” colour should be. Change the uniform, and the colour changes everywhere on the model at once
  • Textures animate…:
    • When you read a U,V value from the texture-bitmap, add a number to U and/or V that comes from a Uniform.
    • Mark the texture as “GL_REPEAT”, so that GL treats it like an infinitely tiled texture
    • Increase that uniform by a tiny amount each time you render a frame (e.g. 0.001), and the texture appears to “scroll”
  • Gun turrets rotate…:
    • Use a second DrawCall to draw the turrets.
    • Each turret has a Uniform “rotation angle in degrees”
    • When rendering, your Shader pre-rotates ALL vertices in each turret by the Uniform’s value.
    • Per frame, change the Uniform for “rotation angle”, and the whole turret rotates at once
  • …etc…
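The texture-scrolling case can be sketched on the CPU in plain C. This is only an emulation of what GL_REPEAT and the shader's "U plus offset" do, not shader code; the function names are hypothetical, and the wrap is done with integer truncation so the sketch needs no maths library.

```c
#include <assert.h>

/* Emulates GL_REPEAT on one coordinate: keep only the fractional part,
   so the result is always in the range [0, 1). */
float wrap_repeat(float coord)
{
    float fraction = coord - (float)(long)coord; /* cast truncates toward zero */
    if (fraction < 0.0f) {
        fraction += 1.0f;   /* negative inputs wrap round from the far side */
    }
    if (fraction >= 1.0f) {
        fraction = 0.0f;    /* guard against rounding up to exactly 1.0 */
    }
    return fraction;
}

/* What the shader effectively samples: the vertex's own U value plus the
   offset that arrives in the Uniform. */
float scrolled_u(float vertex_u, float uniform_offset)
{
    return wrap_repeat(vertex_u + uniform_offset);
}

/* Per-frame CPU update: nudge the offset by a small step. Wrapping it here
   keeps the number small, which preserves float precision over time. */
float advance_offset(float uniform_offset, float step)
{
    assert(step >= 0.0f && step < 1.0f);
    return wrap_repeat(uniform_offset + step);
}
```

Calling `advance_offset(offset, 0.001f)` once per frame and uploading the result is all the "animation" there is; every vertex keeps its original U,V values.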

Implementing and Using Uniforms in OpenGL

Render / Update cycle for Uniforms

With Vertex-Attributes, it was easy:

  1. CPU: Generate geometry (load from a .3ds file; algorithm for making a Cube; etc)
  2. GPU: Create 1 or more VBO’s to hold the data on GPU
  3. CPU->GPU: Upload the geometry from CPU to GPU in a single big chunk
  4. Every frame: GPU reads the data from local RAM, and renders it
  5. To change the data, re-do all the above

With Uniforms, it’s more tricky.

Firstly – like everything else in Shaders – Uniforms ignore any OpenGL features that already existed. The GPU intelligently selected the correct VBOs each frame, by using the data inside the VAO. But Shaders ignore the VAO, and need to be manually switched over.

Secondly – in VBO’s, OpenGL does not care what format your data is in. But with Uniforms, suddenly it does care: you have to specify, every time you upload them.

Thirdly – GL uses an efficient C-language mechanism for uploading Uniforms with minimal overhead. With VBO’s, the VAO took care of this automatically, but again: Shaders need you to do it by hand.

Together, these complicate the process:

  1. CPU: generate a value for the Uniform
  2. CPU: create an area in RAM that will hold the value, and place it there
  3. GPU: automatically creates storage for the Uniform when you compile/link the ShaderProgram
  4. CPU->GPU: switch to the specific ShaderProgram that will use the Uniform
  5. CPU->GPU: don’t send the data; instead, send the “memory-address” of the data
  6. CPU->GPU: upload using one of the many type-specific glUniform* functions (GL ES 2 defines 19 of them; desktop GL 3 has thirty-three) instead of the one for Vertex Attributes
  7. Every frame: GPU reads the data from local RAM, but each ShaderProgram has its own copy
  8. To change the data, re-do all the above

Uploading a value to a Uniform

After you’ve linked your ShaderProgram, you can ask OpenGL about the Uniforms it found. For each Uniform, you get:

  1. The human-readable name used in the GLSL file
  2. The OpenGL-readable name generated automatically (an integer: GLint)
  3. The GLType (int, bool, float, vec2, vec3, vec4, mat2, mat3, mat4, etc)
  4. Is this one value, or an array of values? If it’s an array: how many slots in the array? (all GLSL arrays are fixed length)

The GLType has to be saved, because when you want to upload, there’s a different upload method for each distinct type:

glUniform1i – sends 1 * integer

glUniform3f – sends 3 * floats

glUniformMatrix4fv – sends N * 4×4-matrices, each using floats internally

etc

To handle this automatically, I wrote three chunks of code:

  1. GLK2Uniform.h/m: stores the GLType, is it a Matrix or Vector (or float?), etc
  2. GLK2ShaderProgram.h/m .. -(NSMutableDictionary*) fetchAllUniformsAfterLinking: parses the data from the Shader, and creates GLK2Uniform instances
  3. GLK2ShaderProgram.h/m .. -(void) setValue:(const void*) value forUniform:(GLK2Uniform*) uniform: uses the GLType data etc to pick the appropriate GL method to upload this value to the specified Uniform

That last method takes “const void*” as argument: i.e. it has no type-checking. I find this much simpler than continually specifying the type. It also intelligently handles dereferencing the pointer (for matrices and vectors) or not (for ints, floats, etc).
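The "pick the upload method from the saved type" idea can be sketched as a lookup table in plain C. This is not the GLK2ShaderProgram code: the `UniformKind` enum and `UploadRule` struct are hypothetical stand-ins, and for simplicity the sketch always names the pointer-taking ("v") variants, whereas the method described above passes ints and floats by value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the GLType saved on each Uniform. */
typedef enum {
    KIND_INT, KIND_FLOAT, KIND_VEC2, KIND_VEC3, KIND_VEC4,
    KIND_MAT2, KIND_MAT3, KIND_MAT4, KIND_COUNT
} UniformKind;

typedef struct {
    const char *gl_function; /* name of the GL ES 2 function that would be called */
    int component_count;     /* scalars per element, e.g. 16 for a 4x4 matrix */
    int is_matrix;           /* matrix uploads take an extra "transpose" argument */
} UploadRule;

/* Returns the rule for one kind of Uniform. The table lives in static
   storage, so the returned pointer stays valid for the whole program. */
const UploadRule *upload_rule_for(UniformKind kind)
{
    static const UploadRule rules[KIND_COUNT] = {
        [KIND_INT]   = { "glUniform1iv",        1, 0 },
        [KIND_FLOAT] = { "glUniform1fv",        1, 0 },
        [KIND_VEC2]  = { "glUniform2fv",        2, 0 },
        [KIND_VEC3]  = { "glUniform3fv",        3, 0 },
        [KIND_VEC4]  = { "glUniform4fv",        4, 0 },
        [KIND_MAT2]  = { "glUniformMatrix2fv",  4, 1 },
        [KIND_MAT3]  = { "glUniformMatrix3fv",  9, 1 },
        [KIND_MAT4]  = { "glUniformMatrix4fv", 16, 1 },
    };
    assert((int)kind >= 0 && (int)kind < KIND_COUNT);
    return &rules[kind];
}
```

With a table like this, a single untyped "set value" entry point only needs the saved kind to know which call to make and how many scalars the pointer refers to.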

Uniforms and VAOs: a missing feature from OpenGL

So far, we’ve used VAO’s. They’re very useful; seemingly they:

…store all the render-state that is specific to a particular Draw call

Tragically: Shaders ignore VAO’s. Once you start using Uniforms, you find that VAO’s actually:

…store all the render-state that is specific to a particular Draw call, so long as that state isn’t in a ShaderProgram (i.e. isn’t a Uniform)

Shaders are the only place where VAO’s aren’t used, and it’s very easy to forget this and have your code break in weird and wonderful ways. If you find Shader state seems to be leaking between DrawCalls, you almost certainly forgot to explicitly switch ShaderProgram somewhere.

Note: this applies not only for rendering a DrawCall, but also for setting the value of the Uniform. You must call glUseProgram() before setting a Uniform value.

Because we often want to set Uniform values outside of the draw loop – e.g. when configuring a Shader at startup – I added a method that automatically switches to the right program for you. If you use this repeatedly on every frame, it’ll damage performance, but it’s great for checking if you’ve forgotten a glUseProgram somewhere:

GLK2ShaderProgram.m:

-(void) setValueOutsideRenderLoopRestoringProgramAfterwards:(const void*) value forUniform:(GLK2Uniform*) uniform 
{
	GLint currentProgram;
	glGetIntegerv( GL_CURRENT_PROGRAM, &currentProgram);
	
	glUseProgram(self.glName);
	[self setValue:value forUniform:uniform];
	glUseProgram(currentProgram);
}
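To see why the save / switch / restore dance matters, here is a toy model in plain C. It is NOT OpenGL: `ToyContext` and the `toy_*` functions are invented for this sketch, and the only behaviour they copy from GL is that a uniform write has no "program" argument, so it lands in whichever program is current.

```c
#include <assert.h>

#define MAX_PROGRAMS 4

/* Toy model, not real OpenGL: one float uniform stored per program. */
typedef struct {
    float uniform_value[MAX_PROGRAMS]; /* each program keeps its own copy */
    int current_program;               /* what glUseProgram() would select */
} ToyContext;

void toy_use_program(ToyContext *ctx, int program)
{
    assert(program >= 0 && program < MAX_PROGRAMS);
    ctx->current_program = program;
}

/* Like a glUniform call: no program parameter, so the value is written
   into whichever program happens to be current. */
void toy_set_uniform(ToyContext *ctx, float value)
{
    ctx->uniform_value[ctx->current_program] = value;
}

/* Same shape as the method above: remember the current program, switch,
   write the value, then switch back. */
void toy_set_uniform_restoring(ToyContext *ctx, int program, float value)
{
    int previous = ctx->current_program;
    toy_use_program(ctx, program);
    toy_set_uniform(ctx, value);
    toy_use_program(ctx, previous);
}
```

If program 1 is current and you call `toy_set_uniform` meaning to update program 2, the value silently ends up in program 1: that is the "leaking state" symptom. The restoring variant always hits the intended program and leaves the current one untouched.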

Uniforms in your Game Engine

Game-Engine code has to treat Uniforms a little specially:

  • Unlike Vertex-Attributes, we tend to update Uniforms very frequently – often every frame.
  • We have to reference them by human-readable name (instead of simply ramming them into a homogeneous C-array).
  • We have to remember to keep calling glUseProgram() each time we write to a Uniform, or render a DrawCall.

You can layer it in fancy OOP wrappers, but ultimately you’re forced to have a Hashtable/Map somewhere that goes from “human-readable Uniform name” to “chunk of memory holding the current value, that can be sent to the GPU whenever it changes”.

Desktop GL is different; they modified the GLSL / Shader spec so that it allowed for slightly tighter integration of variables with your main app. Sadly, they didn’t include those features in GL ES.

I’ve tried it a few different ways, but the problem is that you have to store a “string” mapping to a “C-struct”. Worse, OpenGL ignores the value of the struct, it only uses the memory-address. So that struct has to be at a stable location in RAM.

This might not seem a problem, but Apple’s system for storing structs in NSDictionary is to create and destroy them on-the-fly (on the stack) – so there’s never a stable memory-address.

Here’s my current best workaround for ObjectiveC OpenGL apps…

An intelligent, C-based, “Map” class

Observations:

  1. All our data will be structs
    1. C can easily store data if it’s homogeneous
    2. and OpenGL only has circa 10 unique structs for Uniforms
    3. …so: 10 arrays will be enough to store “all possible” Uniform values for a given ShaderProgram
  2. Data is unique per ShaderProgram
    1. C arrays-of-structs can’t change size once created :(
    2. But: the Uniforms for a ShaderProgram are hard-coded, cannot change at runtime
    3. …so: we can create one Map per ShaderProgram, and we know it will always be correct
  3. C-strings are horrible, and we want to avoid them like the plague
    1. We can easily convert C-strings into Objective-C strings (NSString)
    2. Apple’s NSArray stores NSString’s, and returns an int when you ask “which slot contains NSString* blah?”
    3. C allows int’s for direct-fetching of locations in an array-of-structs
    4. …so: we can have a Data Structure of NSString’s, and a separate C-array of structs, and they never have to interact

GLK2UniformMap.h:

@interface GLK2UniformMap : NSObject

+(GLK2UniformMap*) uniformMapForLinkedShaderProgram:(GLK2ShaderProgram*) shaderProgram;

- (id)initWithUniforms:(NSArray*) allUniforms;

-(GLKMatrix2*) pointerToMatrix2Named:(NSString*) name;
-(GLKMatrix3*) pointerToMatrix3Named:(NSString*) name;
-(GLKMatrix4*) pointerToMatrix4Named:(NSString*) name;
-(void) setMatrix2:(GLKMatrix2) value named:(NSString*) name;
-(void) setMatrix3:(GLKMatrix3) value named:(NSString*) name;
-(void) setMatrix4:(GLKMatrix4) value named:(NSString*) name;

-(GLKVector2*) pointerToVector2Named:(NSString*) name;
-(GLKVector3*) pointerToVector3Named:(NSString*) name;
-(GLKVector4*) pointerToVector4Named:(NSString*) name;
-(void) setVector2:(GLKVector2) value named:(NSString*) name;
-(void) setVector3:(GLKVector3) value named:(NSString*) name;
-(void) setVector4:(GLKVector4) value named:(NSString*) name;

@end

You create a GLK2UniformMap from a specific GLK2ShaderProgram. It reads the ShaderProgram, finds out how many Uniforms of each GLType there are, and allocates C-arrays for each of them.

Later, you can use the “setBLAH:named:” methods to set-by-value any struct. Importantly, this does NOT take a pointer! This ensures you can create a struct on the fly – all of Apple’s GLKit methods do this. e.g. you can do:

...
GLK2UniformMap* mapOfUniforms = ...
...
[mapOfUniforms setVector3: GLKVector3Make( 0.0, 1.0, 0.0 ) named:@"position"];
...
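The core trick (names in one list, structs in a separate fixed-size C array, set-by-value into a slot whose address never moves) can be sketched in plain C. This is not the GLK2UniformMap source: `Vec3` stands in for GLKVector3, the `vec3map_*` names are hypothetical, and it uses C strings purely so the sketch compiles on its own, where the real class uses NSString and NSArray.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct { float x, y, z; } Vec3; /* stand-in for GLKVector3 */

typedef struct {
    size_t count;
    const char **names; /* names[i] labels values[i]; caller keeps the strings alive */
    Vec3 *values;       /* allocated once and never resized, so addresses stay valid */
} Vec3Map;

/* Builds a map for a fixed list of names. Returns 1 on success, 0 if the
   allocation failed. */
int vec3map_init(Vec3Map *map, const char **names, size_t count)
{
    assert(count > 0);
    map->count = count;
    map->names = names;
    map->values = calloc(count, sizeof(Vec3));
    return map->values != NULL;
}

/* Returns the stable address of the named slot, or NULL if the name is unknown. */
Vec3 *vec3map_pointer(const Vec3Map *map, const char *name)
{
    for (size_t i = 0; i < map->count; i++) {
        if (strcmp(map->names[i], name) == 0) {
            return &map->values[i];
        }
    }
    return NULL;
}

/* Set-by-value: copies a temporary struct into the slot. Returns 0 if the
   name is unknown, 1 otherwise. */
int vec3map_set(Vec3Map *map, const char *name, Vec3 value)
{
    Vec3 *slot = vec3map_pointer(map, name);
    if (slot == NULL) {
        return 0;
    }
    *slot = value;
    return 1;
}

void vec3map_free(Vec3Map *map)
{
    free(map->values);
    map->values = NULL;
    map->count = 0;
}
```

Because the value is copied into pre-allocated storage, the caller can pass a struct built on the fly, and the pointer later handed to the upload call is the same every frame.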

Connecting the GLK2UniformMap to a GLK2DrawCall

In previous posts, I created the GLK2UniformValueGenerator protocol. This is a simple protocol that uses the same method signatures as used by OpenGL’s Uniform-upload commands.

We extend GLK2UniformMap, and implement that protocol, to create something we can attach to a GLK2DrawCall, and have our rendering do everything else automatically:

GLK2UniformMapGenerator.h:

@interface GLK2UniformMapGenerator : GLK2UniformMap <GLK2UniformValueGenerator>

+(GLK2UniformMapGenerator*) generatorForShaderProgram:(GLK2ShaderProgram*) shaderProgram;
+(GLK2UniformMapGenerator *)createAndAddToDrawCall:(GLK2DrawCall *)drawcall;

@end

Internally, the methods are very simple, e.g.:

GLK2UniformMapGenerator.m:

@implementation GLK2UniformMapGenerator
...
-(GLKMatrix2*) matrix2ForUniform:(GLK2Uniform*) v inDrawCall:(GLK2DrawCall*) drawCall
{
	return [self pointerToMatrix2Named:v.nameInSourceFile];
}
...

NB: in the protocol, I included the GLK2DrawCall that’s making the request. This is unnecessary. In future updates to the source, I’ll probably remove that argument.

Animated textures: the magic of Uniforms

Finally, let’s do something interesting: animate a texture-mapped object.

The sample code has jumped ahead a bit on GitHub, as I’ve been using it to demo things to a couple of different people.

Have a look around the project; note that I’ve split it into two projects. One contains the reusable library code, the other contains a Demo app that shows the library-code in use.

I simplified all the reusable render code to date into a Library class: GLK2DrawCallViewController (extends Apple’s GLKViewController)

I’ve also moved the boilerplate “create a triangle”, “create a cube” etc code into a Demo class: CommonGLEngineCode

The sample project – permanent link to branch for this article – has a simple ViewController that loads a snake image and puts it on a triangle:

AnimatedTextureViewController.m:

@interface AnimatedTextureViewController ()
@property(nonatomic,retain) GLK2UniformMapGenerator* generator;
@end

@implementation AnimatedTextureViewController

-(NSMutableArray*) createAllDrawCalls
{	
	/** All the local setup for the ViewController */
	NSMutableArray* result = [NSMutableArray array];
	
	/** -- Draw Call 1:
	 
	 triangle that contains a CALayer texture
	 */

	GLK2DrawCall* dcTri = [CommonGLEngineCode drawCallWithUnitTriangleAtOriginUsingShaders:
						   [GLK2ShaderProgram shaderProgramFromVertexFilename:@"VertexProjectedWithTexture" fragmentFilename:@"FragmentTextureScrolling"]];
...

That’s using the refactored CommonGLEngineCode class to make a unit triangle appear roughly in the middle of the screen.

Then we set up the UniformMapGenerator (no values yet):

...
	self.generator = [GLK2UniformMapGenerator createAndAddToDrawCall:dcTri];
...

NB: the generator class automatically detects requests for Sampler2D, and ignores them. Those are only used for texture-mapping, which we handle automatically inside the GLK2DrawCall class (see previous post for details).

...
	/** Load a scales texture - I Googled "Public Domain Scales", you can probably find much better */
	GLK2Texture* newTexture = [GLK2Texture textureNamed:@"fakesnake.png"];
	/** Make the texture infinitely tiled */
	glBindTexture( GL_TEXTURE_2D, newTexture.glName);
	glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT);
	glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT);
	
	/** Add the GL texture to our Draw call / shader so it uses it */
	GLK2Uniform* samplerTexture1 = [dcTri.shaderProgram uniformNamed:@"s_texture1"];
	[dcTri setTexture:newTexture forSampler:samplerTexture1];
...

…again: see the previous post for details of what’s happening here; nothing’s changed.

...	
	/** Set the projection matrix to Identity (i.e. "dont change anything") */
	GLK2Uniform* uniProjectionMatrix = [dcTri.shaderProgram uniformNamed:@"projectionMatrix"];
	GLKMatrix4 rotatingProjectionMatrix = GLKMatrix4Identity;
	[dcTri.shaderProgram setValueOutsideRenderLoopRestoringProgramAfterwards:&rotatingProjectionMatrix forUniform:uniProjectionMatrix];
	
	[result addObject:dcTri];
	
	return result;
}

…ditto.

Finally, we now have to implement a callback to update our Generator’s built-in structs and ints and floats once per frame:

-(void)willRenderDrawCallUsingVAOShaderProgramAndDefaultUniforms:(GLK2DrawCall *)drawCall
{
	/** Generate a smoothly increasing value using GLKit's built-in frame-count and frame-timers */
	double framesOutOfFramesPerSecond = (self.framesDisplayed % (4*self.framesPerSecond)) / (double)(4.0*self.framesPerSecond);
	
	[self.generator setFloat: framesOutOfFramesPerSecond named:@"timeInSeconds"];
}
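The arithmetic in that callback is a sawtooth: it climbs from 0 towards 1 over four seconds' worth of frames, then snaps back to 0. Here it is pulled out into a standalone C function so the values are easy to check; `scroll_phase` is a hypothetical name, and the two parameters stand in for GLKit's frame counter and frame rate.

```c
#include <assert.h>

/* Returns a value in [0, 1) that rises steadily over a 4-second period
   and then wraps back to 0. */
double scroll_phase(long frames_displayed, long frames_per_second)
{
    assert(frames_per_second > 0 && frames_displayed >= 0);
    long period_frames = 4 * frames_per_second; /* frames in one full cycle */
    return (double)(frames_displayed % period_frames) / (double)period_frames;
}
```

At 30 frames per second, frame 60 gives 0.5, frame 90 gives 0.75, and frame 120 wraps back to 0.0. Since the texture is set to GL_REPEAT, the snap from just under 1 back to 0 lands on the same texel, so no jump is visible.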

Run the project, tap the button, and you should see snakey skin scrolling along the surface of a 3D triangle:

[Screenshot: snake-skin texture scrolling across the triangle]

From scales-on-a-triangle to realistic snake

The scrolling works by moving our offset across the surface of the triangle. Doing this with a Uniform means that the speed is constant relative to the corners of the triangle.

i.e. if you make the triangle smaller, it will take the same time to cover the distance, but it’s covering a shorter distance, so appears to move slower.

We’re getting the effect – for free! – of skin bunching up and stretching out. All you have to do is make your triangles shorter on the inside of a snake-coil, and longer on the outside.

If you model your snake the easiest possible way, this bunching will happen automatically. Simply take a cylinder and bend it with a transform – the vertex attributes (that force the texture to map across each triangle) won’t change, but the triangle sizes will, causing realistic bunching/stretching of the skin.
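The bunching effect comes down to one ratio, sketched here in plain C. The function name and parameters are hypothetical, and it assumes the texture is mapped linearly along a single triangle edge.

```c
#include <assert.h>

/* uv_per_second: how fast the Uniform offset grows, in texture coordinates.
   uv_span:       how much of the texture the edge covers (e.g. 1.0 for one tile).
   edge_length:   length of that edge in the world.
   Returns how fast the pattern appears to slide along the edge, in world
   units per second. */
double apparent_speed(double uv_per_second, double uv_span, double edge_length)
{
    assert(uv_span > 0.0);
    return uv_per_second * (edge_length / uv_span);
}
```

With the same Uniform rate of 0.25 per second, an edge of length 2.0 shows the skin sliding at 0.5 units per second while an edge of length 1.0 shows it at 0.25: short triangles on the inside of a coil look slow and bunched, long ones on the outside look fast and stretched.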