[D3D11] What is the most performant way to partially but frequently update a large vertex buffer?

Started by
0 comments, last by Holy Fuzz 2 years, 6 months ago

My Direct3D 11 game has a number of large vertex buffers (10K+ vertices) that get updated every frame, but on any individual frame only a subset of the vertices in each buffer actually changes (perhaps ⅓ of them on average, but it varies greatly). Which vertices will be updated is not predictable ahead of time and changes from frame to frame, making it impractical to split them into separate buffers or group them contiguously within a buffer.

So my question is: What is the most performant (i.e., uses the least amount of CPU time) way to update these vertex buffers every frame?

These are the methods I've tried:

1. Create the vertex buffer with D3D11_USAGE_DEFAULT and update it with UpdateSubresource from a copy of the vertex data stored in CPU memory. (The individual vertices are updated as needed in the CPU copy, and then the whole thing is sent to the GPU.) With this approach, my game spends the majority of its CPU time within the UpdateSubresource call.

2. Create the vertex buffer with D3D11_USAGE_DYNAMIC and update it by calling Map/memcpy/Unmap from a copy of the vertex data stored in CPU memory. (As with the previous approach, the individual vertices are updated as needed in the CPU copy, and then the whole thing is sent to the GPU. My understanding of D3D11_USAGE_DYNAMIC is that it must be mapped using D3D11_MAP_WRITE_DISCARD, which is what forces me to keep a local copy, since not all vertices are updated every frame.) With this approach, my game spends the majority of its CPU time within the memcpy call. This approach is slightly faster than the first approach. (My guess is that because my game is generally CPU-bound, using a dynamic vertex buffer in this case is slightly preferable to default.)

3. Create the vertex buffer with D3D11_USAGE_DEFAULT and update it by calling CopyResource from another vertex buffer created using D3D11_USAGE_STAGING. This staging buffer itself gets updated by Mapping it when the individual vertices are about to get modified, modifying the individual vertices within it, and then Unmapping it before copying. This approach is several times slower than either of the previous two approaches, for reasons that I don't really understand, since I thought that this was pretty much what UpdateSubresource does behind the scenes. (If someone can shed some light on this, I would be curious to know more!) In this approach, my game spends the vast majority of its CPU time within the Map call. (I'm guessing it's stalled by the previous CopyResource call.)
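
To make the CPU-side pattern shared by approaches 1 and 2 concrete, here is a minimal sketch of the "shadow copy" idea. It is not my actual game code: the `Vertex` layout and the `ShadowVertexBuffer` class are hypothetical names made up for illustration. The `mapped` pointer stands in for the `pData` field of the `D3D11_MAPPED_SUBRESOURCE` that `ID3D11DeviceContext::Map` fills in when called with `D3D11_MAP_WRITE_DISCARD`, so the sketch compiles without any Direct3D headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical vertex layout, for illustration only.
struct Vertex {
    float position[3];
    float color[4];
};

// CPU-side shadow copy of one vertex buffer's contents (hypothetical helper).
class ShadowVertexBuffer {
public:
    // All vertices start out zero-initialized.
    explicit ShadowVertexBuffer(std::size_t vertexCount) : vertices_(vertexCount) {}

    // Modify a single vertex in the CPU copy; nothing touches the GPU here.
    void setVertex(std::size_t index, const Vertex& v) {
        assert(index < vertices_.size());
        vertices_[index] = v;
    }

    std::size_t sizeInBytes() const { return vertices_.size() * sizeof(Vertex); }

    // Copy the ENTIRE shadow copy into `mapped`. In approach 2, `mapped` is
    // assumed to be the pData pointer obtained from Map with
    // D3D11_MAP_WRITE_DISCARD. Because a discard leaves the previous contents
    // undefined, unchanged vertices have to be rewritten too, which is why
    // this is a full-size memcpy every frame.
    void uploadTo(void* mapped) const {
        if (vertices_.empty()) return;  // avoid memcpy with a null source
        std::memcpy(mapped, vertices_.data(), sizeInBytes());
    }

private:
    std::vector<Vertex> vertices_;
};
```

Per frame, the idea is to call `setVertex` for whichever vertices changed, then Map the dynamic buffer, call `uploadTo` with the mapped pointer, and Unmap. The full-size `memcpy` inside `uploadTo` is the call that dominates my CPU time in approach 2.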

Is there any approach you know of that might be better than these? I'm currently using the 2nd method above, and I would really like to avoid the large amount of time my game is spending in that memcpy call if at all possible, but maybe that's just something I have to live with?

Thanks in advance!

