SVOGI Implementation Details


So far my results testing with motion have not been good. Trailing of objects in motion is quite bad. I can adjust settings to make the pixel changes more abrupt, but then I get bad flickering in the voxelization data.

I am wondering if this might be more of a semi-dynamic feature that should only be used with static geometry. What has everyone's experience been with motion in techniques like this?

10x Faster Performance for VR: www.ultraengine.com


I wanted to avoid this problem by using smooth signals. So instead of binary voxelization, I had gradual density. A moving object caused no discontinuities or popping, just gradual changes of density, colors, normals, etc.
Probably not practical, but I'll explain anyway.

To generate the voxelization, I scattered surfels (which I already had) into the volume, so each surfel contributed to 8 cells.
I did this single-threaded on the CPU, and I did not worry about the write hazards an MT or GPU implementation would face. That's one problem.
The other is that the surfel resolution should match the volume grid resolution, so there are no holes from surfels being too sparse.

Movement was fine even with a low grid resolution, so maybe it's an idea worth considering.
You'd need to precompute the surface samples, have some acceleration structure to find them, and then a compute shader responsible for a block of the volume could do the scattering quickly with atomic adds to LDS.
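
As a rough illustration (not JoeJ's actual code), a scatter like that could look something like the compute shader below. It is simplified to one thread per surfel with global fixed-point atomics instead of the per-block LDS accumulation he describes, and every resource, struct and constant name here is made up:

// One thread per surfel: splat the surfel's density into the 8 surrounding
// cells with trilinear weights. Density is stored as fixed point so that
// integer atomics can be used.
struct Surfel { float3 position; float density; };

StructuredBuffer<Surfel> surfels;       // precomputed surface samples
RWTexture3D<uint>        densityGrid;   // R32_UINT, fixed-point density

cbuffer VoxelizeParams
{
    float3 gridOrigin;   // world-space origin of the volume
    float  cellSize;     // world-space size of one cell
    uint   surfelCount;
};

static const float FIXED_POINT_SCALE = 1024.0f;

[numthreads(64, 1, 1)]
void ScatterSurfels(uint3 id : SV_DispatchThreadID)
{
    if (id.x >= surfelCount) return;

    Surfel s    = surfels[id.x];
    float3 p    = (s.position - gridOrigin) / cellSize - 0.5f; // cell-centered coordinates
    int3   base = (int3)floor(p);
    float3 f    = p - (float3)base;                            // trilinear fractions

    // The 8 weights always sum to 1, so a moving surfel changes the grid
    // smoothly instead of popping between cells. Out-of-bounds UAV writes
    // are dropped by the hardware.
    for (int z = 0; z < 2; z++)
    for (int y = 0; y < 2; y++)
    for (int x = 0; x < 2; x++)
    {
        float w = (x == 1 ? f.x : 1.0f - f.x) *
                  (y == 1 ? f.y : 1.0f - f.y) *
                  (z == 1 ? f.z : 1.0f - f.z);
        uint3 cell = (uint3)(base + int3(x, y, z));
        InterlockedAdd(densityGrid[cell], uint(w * s.density * FIXED_POINT_SCALE));
    }
}

Colors and normals would be scattered the same way into their own channels, and the fixed-point values divided back out when the volume is read.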

I don't know what games using voxel GI do. Maybe they treat dynamic objects differently: ignoring them, or using a capsule representation to get at least some occlusion? No idea.
In tech demos I often saw popping effects from voxels turning on or off, but I never saw this happening in games.

You've hit exactly the problem I struggled with the most and was never able to solve properly.

I generally use a sort of temporal filter; here are examples without and with it (3 variants):

These videos are older, and the scenes had quite a lot of lighting, which makes most of the issues almost invisible. In general it's good enough as long as your objects are not too small (compared to the grid density) and don't contribute too much to your scene. With small objects and a lower grid density it is a major problem; let me show you a third video for that (sorry, I had collider display enabled):

Sorry for spamming videos - I wanted to show how it looks in motion (still images are not really good for that). So what's the solution I used in addition to temporal filtering? A higher-density voxel grid, and generally brighter environments (if possible - that depends on the design of the scene). The former diminishes the 'jumping' effect and the latter diminishes the impact of a single object on the global illumination. In short, there is no magic solution, just well-tuned parameters.

Honestly, I don't know how to solve this. Even if you use atomics and accumulate color per voxel, it's going to fail, because with a sparse enough grid your object effectively 'jumps' along with its GI information (temporal filtering just turns the jumping into a gradual increase and decrease, which is more acceptable but far from ideal). The only real fix is finer discretization (i.e. a denser grid). This is one of the pitfalls of this technique: dynamic objects are hard, and temporal filtering will sometimes make the lighting look weird in motion. I do know Kingdom Come: Deliverance used this technique at some point (at least they did in the beta), and if I'm not mistaken Mafia: Definitive Edition also uses it. Other than that there aren't many examples.
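
For what it's worth, the temporal filter I mean can be as simple as an exponential blend between the freshly voxelized grid and last frame's filtered result. A minimal sketch, with assumed resource names and blend factor:

// Per-voxel exponential moving average: currentGrid is the fresh voxelization,
// historyGrid holds the filtered result carried over from previous frames.
Texture3D<float4>   currentGrid;
RWTexture3D<float4> historyGrid;

cbuffer FilterParams
{
    float blendFactor; // e.g. 0.1 - higher reacts faster but flickers more,
                       // lower is smoother but trails/ghosts more
};

[numthreads(4, 4, 4)]
void TemporalFilter(uint3 voxel : SV_DispatchThreadID)
{
    float4 current = currentGrid[voxel];
    float4 history = historyGrid[voxel];

    // A voxel switching on or off becomes a gradual fade in or out, which
    // trades the flickering for the trailing discussed in this thread.
    historyGrid[voxel] = lerp(history, current, blendFactor);
}

The mip chain used for cone tracing would then be rebuilt from the filtered grid.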

Also keep in mind that I'm using Sponza here to test GI - a game-like environment (with particles, good textures, etc.) is going to yield more acceptable results even with a sparser voxel grid. This counts double when the player moves and you can hide artifacts behind motion blur and other tricks.

My current blog on programming, linux and stuff - http://gameprogrammerdiary.blogspot.com

This is from an earlier implementation that did voxelization on the CPU. I don't have problems with trailing because it's all re-voxelized on the CPU every few frames, but you can see the barrel reflecting itself inappropriately, and other problems:

I have found the problem is even worse in scenes with a lot of diffuse GI. A small change in the scene voxels turns into a big change in the downsampled mipmaps and the diffuse GI lighting result, so you get a lot of flashing areas, exactly like what @vilem otte shows in his third video above.

I think the solution is to rasterize each dynamic object into a 3D texture once, and then store some kind of bounding volume hierarchy for all the dynamic objects in the scene. Instead of constantly re-rasterizing objects, you just transform the ray by the 4x4 matrix for that object when it intersects that AABB. Animated characters could be done, but you would need to break them apart into their limbs and make each limb a separate moving voxel grid. Also, dynamic objects should not contribute to the diffuse GI calculation, just specular reflection.
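
As a sketch of that idea (assumed names throughout, not an existing API): when a sample along the ray lands inside a dynamic object's AABB, you transform the sample point into the object's local space with its inverse matrix and read the object's own, never-changing 3D texture:

struct DynamicVoxelObject
{
    float4x4 worldToLocal;  // inverse of the object's 4x4 world transform
    float3   boundsMin;     // local-space bounds of its voxel volume
    float3   boundsMax;
};

float4 SampleDynamicObject(DynamicVoxelObject obj,
                           float3 rayOriginWS, float3 rayDirWS,
                           float distanceAlongRay, float mipLevel,
                           Texture3D<float4> volume, SamplerState volumeSampler)
{
    // Only this matrix changes as the object moves; the voxel texture does not.
    float3 posWS = rayOriginWS + rayDirWS * distanceAlongRay;
    float3 posLS = mul(obj.worldToLocal, float4(posWS, 1.0f)).xyz;

    // Map the local position to normalized texture coordinates and sample.
    float3 uvw = (posLS - obj.boundsMin) / (obj.boundsMax - obj.boundsMin);
    if (any(uvw < 0.0f) || any(uvw > 1.0f))
        return float4(0.0f, 0.0f, 0.0f, 0.0f);
    return volume.SampleLevel(volumeSampler, uvw, mipLevel);
}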

Performance for a system such as that is a big question, and at that point you might be better off just going with full raytracing. It makes me wonder what would be possible if someone created specialized hardware for voxel data.

That's probably functionality I don't need to include in the 1.0 release of this feature. I will just make the initial version reflect the static scene, which looks great and is a huge improvement over environment probes.

10x Faster Performance for VR: www.ultraengine.com

Josh Klint said:
I think the solution is to rasterize each dynamic object into a 3D texture once, and then store some kind of bounding volume hierarchy for all the dynamic objects in the scene. Instead of constantly re-rasterizing objects, you just transform the ray by the 4x4 matrix for that object when it intersects that AABB. Animated characters could be done, but you would need to break them apart into their limbs and make each limb a separate moving voxel grid. Also, dynamic objects should not contribute to the diffuse GI calculation, just specular reflection.

I was considering a similar solution at one point - a bounding volume hierarchy as the top-level acceleration structure, and 3D textures storing voxels at the bottom level. On paper that would work (and keep in mind, multi-level BVHs and their traversal are suitable for real-time ray tracing). The problem is not even ray traversal through a 3D texture (that's straightforward); the problem is cone tracing.

Traversing a cone through voxels is trivial - you literally just sample the 3D texture with linear interpolation between its mip levels (you can use the variants from the original voxel cone tracing GI paper, but boiled down it is always this). The problem is cone tracing through the BVH.

You see, ray tracing through a BVH is somewhat trivial: you either hit child A, child B, neither, or both (in either order, A-B or B-A). The children you hit need to be evaluated (ideally in order), and once you reach a terminating condition you terminate the ray. The test is a simple ray-AABB calculation. Tracing a cone through a BVH is going to be tricky. An analytical cone-AABB test is not trivial (plus the terminating conditions are not as simple as in the ray case). A numerical approach would be the analogue of sampling: you could sample spheres at growing distances from the origin, test those against the BVH, and whenever a leaf node is intersected, accumulate voxel data from it at a certain mip level. What would the terminating condition be, though?
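
To make that numerical variant concrete, here is a sketch of what the sphere marching could look like. For brevity it loops over a flat list of leaf AABBs instead of descending a real BVH, the per-leaf voxel sample is left as a placeholder, and all names are illustrative:

struct LeafAABB { float3 boundsMin; float3 boundsMax; };

bool SphereOverlapsAABB(float3 center, float radius, LeafAABB box)
{
    float3 closest = clamp(center, box.boundsMin, box.boundsMax);
    float3 d = center - closest;
    return dot(d, d) <= radius * radius;
}

float4 ConeTraceLeaves(float3 origin, float3 dir, float coneAngleTan,
                       StructuredBuffer<LeafAABB> leaves, uint leafCount,
                       float maxDistance)
{
    float4 accumulated = float4(0.0f, 0.0f, 0.0f, 0.0f);
    float  dist        = 1.0f; // start a little in front of the origin

    // Terminate when the accumulated opacity saturates or the cone runs out.
    while (dist < maxDistance && accumulated.a < 1.0f)
    {
        float  radius = max(0.01f, dist * coneAngleTan); // cone footprint, clamped to avoid a zero step
        float3 center = origin + dir * dist;

        for (uint i = 0; i < leafCount; i++)
        {
            if (SphereOverlapsAABB(center, radius, leaves[i]))
            {
                // Placeholder: sample this leaf's voxel volume here, at a mip
                // level matching 'radius'.
                float4 leafSample = float4(0.0f, 0.0f, 0.0f, 0.0f);

                // Front-to-back accumulation.
                accumulated.rgb += (1.0f - accumulated.a) * leafSample.rgb * leafSample.a;
                accumulated.a   += leafSample.a;
            }
        }
        dist += radius; // step grows with the footprint
    }
    return accumulated;
}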

My current blog on programming, linux and stuff - http://gameprogrammerdiary.blogspot.com

On further reflection, I am certain such a system would run fast and work just fine.

  • The BVH does not replace the cascaded volumes. BVH is for dynamic geometry, cascaded volumes are for static.
  • The BVH is just a flat 3D grid. Each grid cell stores a list of dynamic objects that intersect that cell.
  • You do not perform ray tracing through the BVH. You just take samples and add the result to the sample from the cascaded volumes, just like before (see the sketch after this list). If you want to get fancier, you could find the overlapping volume of the dynamic object's bounding sphere and the sampling sphere, and weight the dynamic object's sample contribution by that.
  • Performance impact would be minimal. For most rays, it's just a few extra texture samples per step. Big samples would encompass more dynamic objects and require more extra samples, but you have relatively few of those.
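
Here is roughly what that per-sample lookup could look like, assuming each cell stores an offset and count into one shared index list (all names are made up):

struct GridCell { uint firstObject; uint objectCount; };

StructuredBuffer<GridCell> dynamicGrid;     // flat 3D grid: x + y*W + z*W*H
StructuredBuffer<uint>     objectIndices;   // per-cell object lists, concatenated

cbuffer DynamicGridParams
{
    float3 gridOrigin;
    float  cellSize;
    uint3  gridDim;
};

float4 SampleDynamicObjects(float3 samplePos, float sampleRadius)
{
    int3 cell = (int3)floor((samplePos - gridOrigin) / cellSize);
    if (any(cell < 0) || any(cell >= (int3)gridDim))
        return float4(0.0f, 0.0f, 0.0f, 0.0f);

    GridCell gc = dynamicGrid[cell.x + cell.y * gridDim.x + cell.z * gridDim.x * gridDim.y];

    float4 result = float4(0.0f, 0.0f, 0.0f, 0.0f);
    for (uint i = 0; i < gc.objectCount; i++)
    {
        uint objectIndex = objectIndices[gc.firstObject + i];
        // Placeholder: sample objectIndex's own voxel volume here (e.g. the
        // per-object 3D-texture lookup sketched earlier), optionally weighted
        // by how much its bounding sphere overlaps the sampling sphere of
        // radius 'sampleRadius'.
        float4 contribution = float4(0.0f, 0.0f, 0.0f, 0.0f);
        result += contribution;
    }
    return result;
}

The result just gets added to the sample taken from the cascaded volumes at the same step.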

I do not know if this would work well for diffuse GI, but I am leaning towards diffuse GI being affected only by static geometry, gradually updated in the background as the camera moves. This would give you good specular reflection of dynamic objects with good performance.

My formula for adding samples to the ray result is like this:

accumulatedLight.rgb += (1.0f - accumulatedLight.a) * coneSample.rgb * coneSample.a;
accumulatedLight.a += coneSample.a;
if (accumulatedLight.a > 1.0f) break;
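
For context, here is a minimal sketch of the cone march that accumulation sits in, assuming the cone is traced in the volume's normalized texture space and the mipmapped radiance volume is called voxelRadiance (names are assumptions, not the engine's actual code):

Texture3D<float4> voxelRadiance;   // mipmapped voxel radiance
SamplerState      linearSampler;

float4 ConeTrace(float3 origin, float3 dir, float coneAngleTan,
                 float voxelSize, float maxDistance)
{
    float4 accumulatedLight = float4(0.0f, 0.0f, 0.0f, 0.0f);
    float  dist = voxelSize; // skip the first voxel to avoid self-sampling

    while (dist < maxDistance)
    {
        // The cone footprint picks the mip level: wider footprint, higher mip.
        float diameter = max(voxelSize, 2.0f * coneAngleTan * dist);
        float mip      = log2(diameter / voxelSize);

        float4 coneSample = voxelRadiance.SampleLevel(linearSampler, origin + dir * dist, mip);

        // Front-to-back accumulation, exactly as posted above.
        accumulatedLight.rgb += (1.0f - accumulatedLight.a) * coneSample.rgb * coneSample.a;
        accumulatedLight.a   += coneSample.a;
        if (accumulatedLight.a > 1.0f) break;

        dist += diameter * 0.5f; // step roughly by the footprint radius
    }
    return accumulatedLight;
}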

10x Faster Performance for VR: www.ultraengine.com

I was just trying the UE5 Matrix demo, which uses such a system to trace local SDF volumes per object.
Rofl. Obviously my 10 TFLOP GPU + 8-core CPU is not enough to achieve a playable framerate; it runs at something like 20 fps or less.
Sadly there are no settings to tweak the lighting. I'm sure Lumen is the issue. I've heard that with an RT GPU they now replace all their SDF tracing with DXR, and I wonder how much that helps fps.
Seeing one dev after another switching over to UE5, I guess my days as a gamer are numbered. I won't sell our car to afford an RTX card. :D

JoeJ said:
I was just trying the UE5 Matrix demo, which uses such a system to trace local SDF volumes per object.

I wonder how they handle the BVH traversal itself (you could have a function determining the distance to both child nodes for a given BVH node, so that's probably it). I was honestly hoping to try the SDF approach one day, but I've never actually written SDF generation on the GPU, nor worked out the BVH logic with SDFs in mind. Damn, you're making my "to try" list longer and longer.

JoeJ said:
Seeing one dev after another switching over to UE5, I guess my days as a gamer are numbered. I won't sell our car to afford an RTX card. :D

Darn, and here I am with a car and a Radeon 6800 … but the hunger and lack of food do bother me a bit.

My current blog on programming, linux and stuff - http://gameprogrammerdiary.blogspot.com

Vilem Otte said:
I wonder how they handle the BVH traversal itself (you could have a function determining the distance to both child nodes for a given BVH node, so that's probably it).

There is a very interesting advantage to SDFs:
Imagine we have many overlapping models, like the first UE5 demo had, where they did a lot of kitbashing to model the cave.
Traditional ray tracing tanks here, because the ray needs to traverse all of the overlapping models down to their closest intersections; only after that do you know the closest intersection across all models.
Per-object SDFs do much better there: you can compare the distances at the entry points of all the models, and descend only into the single model with the smallest distance.
(EDIT: I just realized you need no traversal stack either, for the same reason: you never need to traverse multiple children, just the closest one. Neat.)
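
A tiny sketch of that, with DistanceToObject() standing in for "transform the point into the object's local space and sample its SDF texture" (all names illustrative):

// Stand-in for sampling one object's SDF volume at a world-space position.
float DistanceToObject(uint objectIndex, float3 posWS)
{
    // ... transform posWS into the object's local space and sample its SDF ...
    return 1e30f; // placeholder so the sketch compiles
}

float TraceSDFObjects(float3 origin, float3 dir, uint objectCount, float maxDistance)
{
    float t = 0.0f;
    while (t < maxDistance)
    {
        float3 p = origin + dir * t;

        // Only the closest object matters for the next step, so there is
        // nothing to push onto a stack.
        float minDist = maxDistance;
        for (uint i = 0; i < objectCount; i++)
            minDist = min(minDist, DistanceToObject(i, p));

        if (minDist < 0.001f) return t; // hit
        t += minDist;                   // safe step: can't skip past any surface
    }
    return -1.0f; // miss
}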

That's probably the reason why RTX gave no speedup in those demos. It may be very different in the city demo, which has almost no overlaps.

I don't know how they organize the SDF models to find them, but I guess they use a simple regular grid. In the city demo you can see a harsh transition at some distance, beyond which no more GI effects are present.

