D3D12 Resource State Tracking

8 comments, last by __MONTE2020 2 years, 10 months ago

I was wondering how you guys handle resource state tracking when using multithreaded command list building. I am aware of a YouTube video on the DirectX channel explaining one strategy, but I am curious to hear about other approaches (if any) to this problem.


We don't, really. Our engine is very explicit about how resources are used and where the barrier points are, which keeps things very lean at runtime. If you do want something more automatic, a frame graph is one way to do it. These kinds of systems generally work by gathering all of the dependencies up-front in some kind of build step, and then repeating them every frame (as opposed to tracking everything to “discover” dependencies every frame, which can add overhead). The best choice is not necessarily straightforward, since there are trade-offs here between runtime cost, ease of use, engineering time, ability to quickly add new features, etc.
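To make the "build step" idea concrete, here is a minimal sketch of how such a system might resolve barriers up-front. Everything in it is an assumption for illustration: `State`, `Use`, `Pass`, `Barrier` and `CompileBarriers` are hypothetical names, the states only stand in for real values such as `D3D12_RESOURCE_STATE_DEPTH_WRITE`, and a real frame graph would also handle things like resource lifetimes and multiple queues.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-ins for D3D12 resource states; not the real enum.
enum class State { DepthWrite, DepthRead, ShaderResource };

// One declared use of a resource by a pass.
struct Use { std::string resource; State required; };

// A pass declares up-front which resources it touches and in which state.
struct Pass { std::string name; std::vector<Use> uses; };

// A transition that must be recorded at the start of a pass.
struct Barrier { std::string resource; State before; State after; };

// Build step: walk the passes once, in submission order, and work out which
// barriers each pass needs. The result can be reused every frame as long as
// the pass list does not change and each resource ends the frame in its
// initial state.
std::vector<std::vector<Barrier>> CompileBarriers(
    const std::vector<Pass>& passes,
    std::map<std::string, State> current /* initial states, copied */) {
    std::vector<std::vector<Barrier>> perPass(passes.size());
    for (std::size_t i = 0; i < passes.size(); ++i) {
        for (const Use& use : passes[i].uses) {
            State& state = current.at(use.resource);  // throws if undeclared
            if (state != use.required) {
                perPass[i].push_back({use.resource, state, use.required});
                state = use.required;
            }
        }
    }
    return perPass;
}
```

Because every pass gets its own precomputed barrier list, each recording thread can emit its barriers without looking at any shared "current state" at runtime.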

@MJP Thanks for the reply. What about a scenario where one resource (e.g. a depth buffer) is used on two threads running concurrently, and they both have to transition it to some state, but to do that they have to know the “before” state, which is not obvious if the other thread has already transitioned the resource to some other state (different from the starting one)? Wouldn't that lead to errors? Do you completely avoid those cases? Maybe you always transition resources back to their starting state (I am not sure if this would work, and I'm also not sure about performance)?

If you're recording two different passes in separate threads you probably have 1 of 2 scenarios:

  1. These passes will run synchronously on the GPU so there is a single linear progression of state for your depth buffer (e.g. depth-write → read-only).
  2. These passes run asynchronously on the GPU (maybe async compute?).

Case 2 is illegal as two passes can't be accessing the same resource in different states at the same time, so case 1 is the one you need to solve.

I would not recommend storing “current” resource state on your actual graphics object abstraction for your depth buffer because you'll run into your current problem: if you record from multiple threads, what actually is the previous or next state of the resource? Older APIs could get away with this because 1) you generally couldn't record from multiple threads at once and/or 2) they would resolve resource transitions at command submission time in a separate pass. You can't really do the latter since that happened in the driver and you can't go back and modify your DX12/Vulkan command buffers after recording them, so you need to know this information at recording time.

Either you as the programmer know how your frame is structured so you hard-code these state transitions when recording your render passes (in separate threads or in the same thread, doesn't matter), or you have some higher-level abstracted code that resolves resource state transitions when setting up your render passes (like a frame graph). For most projects the former is honestly probably enough, but if you want a more general solution you need some way to abstract out how resources are accessed in your render passes so that you can have some code automatically resolve resource transitions when beginning/ending passes in separate threads.

@__MONTE2020 in our case the states are all known, and effectively hard-coded into the threads that generate the commands. For depth buffers, they will typically be in a “loop” where every frame they begin with the same resource state, and transition to another. As an example, a depth buffer might always begin in a readable state. The first command list to write to it transitions it to writable, then a later command list transitions it to a readable state so that a shader can access it. The process then repeats every frame. This can certainly lead to errors if you mess something up, but the debug layer helps catch these sorts of things.
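As a rough illustration of that per-frame "loop", the sketch below hard-codes the two depth buffer transitions and checks that they chain correctly and end in the starting state. The names (`State`, `Transition`, `DepthBufferFrameTransitions`, `FormsLoop`) are made up for this example and are not D3D12 API; in real code each entry would be a `ResourceBarrier` call recorded on the relevant command list.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for D3D12 resource states; not the real enum.
enum class State { DepthRead, DepthWrite };

struct Transition { State before; State after; };

// The per-frame transitions for one depth buffer, in GPU execution order.
// These are hard-coded: each command list knows its own before/after pair
// without consulting any shared "current state" variable.
std::vector<Transition> DepthBufferFrameTransitions() {
    return {
        {State::DepthRead, State::DepthWrite},  // first list that writes depth
        {State::DepthWrite, State::DepthRead},  // later list, before shaders read it
    };
}

// Returns true if the transitions chain correctly from frameStart and leave
// the resource back in frameStart, so the same sequence is valid next frame.
bool FormsLoop(State frameStart, const std::vector<Transition>& transitions) {
    State state = frameStart;
    for (const Transition& t : transitions) {
        if (t.before != state) return false;  // mismatched "before" state
        state = t.after;
    }
    return state == frameStart;
}
```

A check like this is only a development-time sanity test; it does not replace the debug layer, which validates the barriers actually recorded.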

Thanks to both of you. Just to be clear, here is a simple scenario:

resource R has state S1 in the beginning of the frame

command list C1 needs R in state S2

command list C2 needs R in state S3

C1 and C2 are each built on their own thread. If I call ExecuteCommandLists(2, {C1,C2}), i.e. C1 comes before C2 in the array,

then it's ok to call ResourceBarrier(R,S1,S2) on C1 and ResourceBarrier(R,S2,S3) on C2?

__MONTE2020 said:

then it's ok to call ResourceBarrier(R,S1,S2) on C1 and ResourceBarrier(R,S2,S3) on C2?

I believe so, yes. I don't have much experience with DX12, but that is how I would do it in Vulkan which should be similar enough.

Yes that's totally fine. Ultimately the GPU doesn't even know that you've built those command lists in two different threads on the CPU: it will read the first command list followed by the second in serial order, and will perform the necessary actions required for the transition barriers in the order they're found in the command lists.
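The S1/S2/S3 scenario can be modelled with a small replay check, shown here as a sketch under stated assumptions: `CommandList` is just a list of transitions for resource R, and `ReplayInSubmissionOrder` is a hypothetical helper, not something D3D12 provides. It only mimics the serial order in which the queue consumes the lists.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins: S1/S2/S3 are the states from the scenario above,
// and CommandList holds only the transitions recorded for resource R.
enum class State { S1, S2, S3 };
struct Transition { State before; State after; };
using CommandList = std::vector<Transition>;

// Replays command lists in submission order, the way the queue consumes them.
// Returns true and writes the final state if every barrier's "before" state
// matches the state R is actually in at that point. On failure, finalState
// is left untouched.
bool ReplayInSubmissionOrder(State initial,
                             const std::vector<const CommandList*>& submitted,
                             State* finalState) {
    State state = initial;
    for (const CommandList* list : submitted) {
        for (const Transition& t : *list) {
            if (t.before != state) return false;
            state = t.after;
        }
    }
    *finalState = state;
    return true;
}
```

For example, if C2 is filled with the S2 to S3 transition before C1 is filled with S1 to S2 (as could happen when another thread finishes first), replaying `{&c1, &c2}` from S1 still succeeds and ends in S3, while replaying `{&c2, &c1}` fails. Only the order passed at submission matters, not the order of recording.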

Thanks for the confirmation. I managed to implement it, now knowing that resource transitions aren't as complicated as I thought.

This topic is closed to new replies.
