I don't understand render queues

Started by
7 comments, last by Perryade 1 year, 6 months ago

So render queues sort objects so as to minimize the number of state changes. Say each object has a shader, texture and vbo associated with it, and say we sort the objects to draw based on these three parameters.

struct Object {
    //
    Shader* shader;
    Texture* texture;
    Vbo* vbo;
};

void SetStates(Object* object) {
    SetShader(object->shader);
    SetTexture(object->texture);
    SetVbo(object->vbo);
    Draw(object);
}

Now suppose we sort the objects by shader, texture and vbo, in that order, and call SetStates on each object. Won't we have redundant state calls when two objects share the same state? If two objects, objectA and objectB, use the same shader and we run the code below, we get one redundant SetShader call. And in OpenGL at least, redundant calls are not guaranteed to be optimized away.

SetStates(objectA);
SetStates(objectB);

Am I misunderstanding how render queues actually work?

Isn't it better to structure the rendering like below instead? That way we avoid redundant calls.

for shader in shaders:
    set shader
    for texture in textures:
        set texture
        for vbo in vbos:
            set vbo
            for object in objects:
                draw object
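
For comparison, the nested loops above visit objects in the same order that a sort by (shader, texture, vbo) produces. A minimal sketch of that sort follows. It assumes integer IDs in place of the Shader*/Texture*/Vbo* pointers so the ordering is well defined, and SortQueue is a hypothetical name, not something from an existing library.

```cpp
#include <algorithm>
#include <tuple>
#include <vector>

// Hypothetical stand-in for Object: integer IDs replace the pointers
// (an assumption made so that comparisons are well defined).
struct Object {
    int shader;
    int texture;
    int vbo;
};

// Sort by shader first, then texture, then vbo. std::tie builds tuples of
// references, and tuples compare lexicographically.
void SortQueue(std::vector<Object>& queue) {
    std::sort(queue.begin(), queue.end(),
              [](const Object& a, const Object& b) {
                  return std::tie(a.shader, a.texture, a.vbo) <
                         std::tie(b.shader, b.texture, b.vbo);
              });
}
```

After sorting, objects that share a shader are adjacent, and within each shader group objects that share a texture are adjacent, which is the grouping the nested loops spell out explicitly.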

Just track the current state and don't call SetShader if it's not necessary. It's really as simple as that.

@MJP Thanks for the reply! Can I ask if that's how it's actually done in a render queue?

@fleabay Yeah it was weird, my account got disabled twice and my post removed as I was about to post a reply to MJP. I've been contacted by the mods since. Apparently it had to do with the spam filter being overly aggressive on new users. I think I'm in the clear now.

orange1 said:
@MJP Thanks for the reply! Can I ask if that's how it's actually done in a render queue?

In the internal loop that executes each "Object", you declare variables that keep track of the current value of every state. In your simplified example, this could be:


void SetStates(Object* object)
{
	static Object currentState; // better store as member-variable if this method is a class-member
	
	Texture* newTexture = object ? object->texture : nullptr;
	if (currentState.texture != newTexture)
	{
		SetTexture(newTexture);
		
		currentState.texture = newTexture;
	}
}
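
The same check extends to all three states. Below is a rough sketch under a few assumptions: integer IDs stand in for the pointers, counters stand in for the real SetShader/SetTexture/SetVbo/Draw calls, and StateCache and DrawQueue are hypothetical names. The counters make it easy to see how many state changes a sorted queue actually causes.

```cpp
#include <vector>

// Hypothetical stand-in for Object: integer IDs instead of pointers.
struct Object { int shader; int texture; int vbo; };

// Remembers what is currently bound and only "sets" a state when it differs.
// The counters replace real graphics API calls (an assumption for this sketch).
struct StateCache {
    int shader = -1, texture = -1, vbo = -1;  // -1 means nothing bound yet
    int shaderSets = 0, textureSets = 0, vboSets = 0, draws = 0;

    void Submit(const Object& o) {
        if (shader != o.shader)   { shader = o.shader;   ++shaderSets;  }  // SetShader(...)
        if (texture != o.texture) { texture = o.texture; ++textureSets; }  // SetTexture(...)
        if (vbo != o.vbo)         { vbo = o.vbo;         ++vboSets;     }  // SetVbo(...)
        ++draws;                                                           // Draw(...)
    }
};

// Walks the (already sorted) queue once, front to back.
StateCache DrawQueue(const std::vector<Object>& sortedQueue) {
    StateCache cache;
    for (const Object& o : sortedQueue) {
        cache.Submit(o);
    }
    return cache;
}
```

With a queue sorted by shader, texture and vbo, each counter only goes up when the corresponding value changes between neighbouring objects, so thousands of objects sharing a vbo cost one SetVbo call plus one cheap comparison per object.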

Thank you @Juliean! What I wonder now is, since render queues require conditional checks, why not use a nested class structure for rendering instead? That feels like a much more elegant approach. What doesn't sit right with me about render queues is this: imagine we have thousands of objects sorted with the same vbo. Despite them being sorted, we still have to do a conditional check on the vbo for each object.

With nested classes we could have a structure something like this. (I omitted render target and render pass classes to simplify the example)

class ShaderDraw {
     Shader* shader;
     class TextureDraw {
          Texture* texture;
          class VboDraw {
              Vbo* vbo;
              class ObjectDraw {
                 Object* object;
              };
          };
     };
};

Now with this approach we would avoid having to do conditional checks.

Is the issue with this approach that it's not flexible? For instance if we would like to change the order of classes or add new classes?

orange1 said:
Is the issue with this approach that it's not flexible? For instance if we would like to change the order of classes or add new classes?

The problem with your approach is that in theory it could be faster - but in practice it will be way, way worse because of how CPUs and memory work. The thing is, conditional checks like those above are comparatively very cheap, especially if the result changes very infrequently - branch predictors are very good at their job. On the other hand, a nested structure like the one you describe requires jumping around in memory while iterating over the different subsets of states. So yes, in practice having just one memory area of “Drawables”, or whatever you call them, that you process linearly once will perform much better on real chips.

The additional benefits of this approach are, yes, flexibility, as well as general simplicity (you remove a drawable from one list instead of having to look through multiple). You'll also have lots more states (scissor rects, alpha, z-buffer, render targets etc…) which already fit nicely in the standard render-queue format, and which you'd either have to shoehorn in or handle with very deep nesting (which might not even be able to represent everything, when you have permutations of nested states that don't match).

But the main thing is really that there is no practical benefit to doing it your proposed way; all the checks will barely matter compared to the gains you get from helping your CPU's cache with a simple, linear memory structure.
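
One common way to get such a linear structure is to pack the states into a single integer sort key and keep all draw items in one contiguous array. The sketch below is only an illustration: the DrawItem type, the MakeKey and SortItems names, and the bit layout (8 bits render target, 16 bits shader, 20 bits texture, 20 bits vbo) are all assumptions chosen for the example.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical draw item: a packed sort key plus an index into whatever
// array holds the actual object data.
struct DrawItem {
    std::uint64_t key;
    std::uint32_t objectIndex;
};

// Assumed bit layout, most significant first:
// 8 bits render target | 16 bits shader | 20 bits texture | 20 bits vbo.
// States placed in higher bits dominate the sort order.
std::uint64_t MakeKey(std::uint32_t renderTarget, std::uint32_t shader,
                      std::uint32_t texture, std::uint32_t vbo) {
    return (static_cast<std::uint64_t>(renderTarget & 0xFFu)    << 56) |
           (static_cast<std::uint64_t>(shader       & 0xFFFFu)  << 40) |
           (static_cast<std::uint64_t>(texture      & 0xFFFFFu) << 20) |
            static_cast<std::uint64_t>(vbo          & 0xFFFFFu);
}

// One contiguous vector, one integer comparison per pair of items.
void SortItems(std::vector<DrawItem>& items) {
    std::sort(items.begin(), items.end(),
              [](const DrawItem& a, const DrawItem& b) { return a.key < b.key; });
}
```

Adding another state (say, a blend mode) then means reserving a few more bits in the key rather than adding another level of nesting, and reordering priorities means moving bit fields around.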

@juliean Waited to reply to avoid the wrath of the spam filter again. But this cleared everything up, thanks very much!

