Instances, Materials, and Visibility Buffer Rendering

2 comments, last by Sprue 9 months ago

Looking to move from deferred to a vis-buffer approach with compute and indirect draw/dispatch. I have a multi-user, many-views-on-one-machine situation (7-8 views before surplus shadow maps) that looks like it'd benefit from multiview culling à la Engel (http://diaryofagraphicsprogrammer.blogspot.com/2018/03/triangle-visibility-buffer.html) and from general GPU culling. I'm not quite grokking how to cope with IDs in a more general pipeline versus the trivial one in Engel, which is pre-transformed soup; what I've got is basically an X-COM tile map of mesh instances whose tiles (or whole spans of tiles) are frequently swapped out.

If it's just straight drawing these generated index buffers, that's no big deal. With programmable vertex pulling on SV_VertexID, the cluster is implicit from the pulled indices (provided the buffer is aligned to max-triangles-per-cluster), so there's no need to pass it explicitly. I've seen other approaches that record the triangle index relative to the cluster and then use 24 bits for the cluster index, but again: in those, everything is already there and pre-transformed into world space.
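For concreteness, the 24-bit-cluster scheme mentioned above boils down to a packing like this. This is a minimal host-side sketch under assumed names (`PackClusterTri`, `kTriBits`, etc. are illustrative, not from any particular engine):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 24:8 split: 24 bits of cluster index, 8 bits of
// triangle-index-within-cluster, packed into one 32-bit ID.
constexpr uint32_t kTriBits = 8;
constexpr uint32_t kTriMask = (1u << kTriBits) - 1u;

uint32_t PackClusterTri(uint32_t cluster, uint32_t triInCluster) {
    assert(cluster < (1u << 24) && triInCluster <= kTriMask);
    return (cluster << kTriBits) | triInCluster;
}

void UnpackClusterTri(uint32_t packed, uint32_t& cluster, uint32_t& tri) {
    cluster = packed >> kTriBits;
    tri     = packed & kTriMask;
}
```

The same bit arithmetic works in HLSL on the shader side; the point is only that an 8-bit triangle field caps clusters at 256 triangles each.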

I'm thinking I have to suck it up and swallow an R32G32 target for the visibility buffer, holding the instance ID and triangle index, then pull transform/material/base-offset info by instance ID.
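A minimal sketch of what that R32G32 resolve could look like, with the per-instance record doing the indirection (all struct and field names here are hypothetical, just to show the lookup shape):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical per-instance record, indexed by the instance ID read
// from the R channel of the visibility buffer.
struct InstanceData {
    uint32_t transformIndex;   // index into a transform buffer
    uint32_t materialIndex;    // index into a material table
    uint32_t baseIndexOffset;  // where this instance's indices start
};

// Simulated R32G32 texel: x = instance ID, y = triangle index within
// that instance's index range.
struct VisTexel { uint32_t instanceId; uint32_t triangleId; };

// Resolve the first index of the triangle hit at this texel so vertex
// attributes can be pulled manually in the shade pass.
uint32_t FirstIndexOfTriangle(const VisTexel& t,
                              const std::vector<InstanceData>& instances) {
    const InstanceData& inst = instances[t.instanceId];
    return inst.baseIndexOffset + t.triangleId * 3;
}
```

Everything else (transform, material) hangs off the same `InstanceData` record, so the fat stays in structured buffers rather than in the render target.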

If I had to, I could build a massive buffer of pre-transformed world geometry and add some bitfields like in DOOM Eternal to discard triangles so destruction works (a headache, but I can make that work), or just deallocate the old clusters and pump new ones in wherever there's a big enough span, plus a bit in the framebuffer to denote "nah, dude, that's in the animated-meshes output buffers, it's something else sourced elsewhere." But is a great big pre-transformed set of clusters actually a win? Do instances just become a CPU-side thing, taking responsibility for their allocated clusters and dispatches to change GPU data to reflect state?

---

I guess what I'm really asking, vaguely, is: what do these pipelines look like in real-world usage? I'm not asking how to compact some draw args or build a fragment list for material dispatch. I'm asking how you get a "hey, meshlet cluster here, you're 700 degrees Celsius right now, do your glowing shader thingy" instance param through the pipe, used by the right triangle/fragment at shade time, without bungling everybody else using the same geometry.


fleabay said:
Is this more involved than passing the temp variable to the shader and letting the shader decide how to render the mesh?

??? That's the question I'm asking … I think? This subject sucks to talk about and is terms/definitions hell.

The compute-culling pipeline examples that are actually open to reading are crude and limited. I'd like to know what a pipe that meets real-world situations looks like — say, animating a fresnel glow effect to a specific time phase. What is the instance → material → cluster → triangle relationship, and how is that chain established so shading is performed correctly?
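One way to picture that chain: the visibility buffer stores only IDs, and everything else is reached through indirection tables keyed on those IDs, so per-instance params never touch the shared geometry. A minimal sketch under assumed names (the 700 °C glow example from above; the response curve is made up):

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical tables: per-instance params live on the Instance record,
// which also points at a shared material. Two instances of the same mesh
// differ only here, so one can glow while the other doesn't.
struct Instance   { uint32_t materialId; float temperatureC; };
struct ShadeInput { uint32_t instanceId; uint32_t triangleId; }; // one vis-buffer texel

// Illustrative shade-time lookup: ramp glow in between 500 C and 900 C.
float GlowIntensity(const ShadeInput& px,
                    const std::vector<Instance>& instances) {
    const Instance& inst = instances[px.instanceId];
    float t = (inst.temperatureC - 500.0f) / 400.0f;
    return t < 0.0f ? 0.0f : (t > 1.0f ? 1.0f : t);
}
```

The triangle ID never enters the param lookup at all; it only matters for reconstructing geometry and interpolants. The instance ID alone carries the "which copy is this" state through the pipe.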

After rereading Burns, the Ubisoft Dawn Engine stuff (the thing where they generate a 16-bit depth buffer to use a depth-equals test as a filter), and a bunch of deferred-texturing material, I've sorted it out. It's entirely arbitrary. I got derailed by that arbitrariness and by the funkiness (or succinctness, I suppose) of The Forge's example not doing more than it has to: there's no need for it, or the other code I've looked at, to fuss with instances and jazz when it's just drawing a showroom.

Thanks, rubber ducks, for reading me ramble and rant!

Instance ID injected ahead of the emitted triangle indices, and R32G32_UINT it is, because screw fussing over some whacked 1:14:9:8 packing of skinned-flag, instance, cluster-index, triangle-in-cluster-index gobbledygook.

This topic is closed to new replies.
