Started to Dive into the Compute Shader!

Published March 06, 2015
I'll admit that when I first heard about compute shaders, I thought they were an overly complex ordeal. They're actually fascinating because of the parallel possibilities. They can seriously speed up your game when creating box blurs or any other post-processing effect, and even when managing particle physics.

I created an Unordered Access View (UAV), and from that I was able to save the image you see below. It's a simple example, nothing over the top, but it gave me the basic idea of how a compute shader works. The texture I created for the UAV and SRV was 1024, with one thread for every pixel filling up the 1024 image; the gradient ramp you see is the result of the XY thread IDs.
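To make the "one thread per pixel" idea concrete, here is a minimal CPU-side sketch of what such a gradient fill might look like. It is not the shader from this post: the names `Pixel` and `fillGradient` are made up for illustration, and mapping X to red and Y to green is an assumption about how the ramp was produced.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical types and names, for illustration only.
struct Pixel { float r, g, b, a; };

// Fills a size x size image on the CPU the way one GPU thread per pixel
// might: each (x, y) "thread ID" becomes that pixel's color.
// Assumption: red ramps along X and green ramps along Y.
std::vector<Pixel> fillGradient(std::uint32_t size) {
    std::vector<Pixel> image(static_cast<std::size_t>(size) * size);
    // Avoid dividing by zero for a 1 x 1 image.
    const float maxIndex = size > 1 ? static_cast<float>(size - 1) : 1.0f;
    for (std::uint32_t y = 0; y < size; ++y) {
        for (std::uint32_t x = 0; x < size; ++x) {
            // In a compute shader, (x, y) would arrive as SV_DispatchThreadID.xy
            // and the two loops would disappear: every pixel runs in parallel.
            Pixel& p = image[static_cast<std::size_t>(y) * size + x];
            p.r = static_cast<float>(x) / maxIndex;
            p.g = static_cast<float>(y) / maxIndex;
            p.b = 0.0f;
            p.a = 1.0f;
        }
    }
    return image;
}
```

Calling `fillGradient(1024)` gives a buffer whose top-left pixel is black and whose far corner is full red plus green, which is roughly the kind of ramp described above.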

conputeOutput.jpg

I'm digging compute shaders and I'll be exploring the unlimited possibilities with them. I understand that compute shaders won't solve everything, though they can help hide some stalling. I'm actually glad I looked into compute shaders in more depth.

This is just me ranting, overly excited :D

Edit: The newly attached photo shows what's being rendered to a render target and then passed through the compute shader as is, with no post-process effect applied by the compute shader.

conputeOutput.jpg

There's more to learn from this great experience I must admit! I look forward to it!

Comments

unbird

:lol: I remember doing exactly the same thing when I started: "Huh, what's all this SV_GroupThreadWhatever do ???"

March 06, 2015 11:33 PM
Paul C Skertich

Yeah, I know, right! It was all like, what in God's name! Last night I grabbed a render target SRV, put it through the compute shader, then saved it as a JPEG. I also came to realize why creating a render target of R32G32B32A32_FLOAT can be heavy on memory and performance. A render target of R8G8B8A8_UNORM does just fine. One thing I do have to do in my engine is enumerate all the display modes, save them to a configuration file, then have the program pick a suitable one. I'm not sure that dispatching 64, 21, 1 threads for the constant buffer just to fill up the render target is any good, because apparently only 32 is the max in the shader. After reading MSDN, it looks like the max number of dispatchable threads is higher than 64, but I'm not sure how well that will perform. I'll have to use a high-resolution timer to see the difference.
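As a rough back-of-the-envelope check on the memory point, the raw size of a render target is just width times height times bytes per pixel. The sketch below assumes a 1024 x 1024 target (the post only says "1024") and ignores mip levels, alignment, and driver padding; `renderTargetBytes` and the constants are hypothetical names.

```cpp
#include <cstdint>

// Bytes per pixel for the two formats discussed (4 channels each).
constexpr std::uint32_t kBytesPerPixelR32G32B32A32Float = 16; // 4 x 32-bit float
constexpr std::uint32_t kBytesPerPixelR8G8B8A8Unorm     = 4;  // 4 x 8-bit unorm

// Raw texel storage only: no mips, no alignment or padding (an assumption).
constexpr std::uint64_t renderTargetBytes(std::uint32_t width,
                                          std::uint32_t height,
                                          std::uint32_t bytesPerPixel) {
    return static_cast<std::uint64_t>(width) * height * bytesPerPixel;
}

// 1024 x 1024: 16 MiB for the float format versus 4 MiB for the 8-bit one.
static_assert(renderTargetBytes(1024, 1024, kBytesPerPixelR32G32B32A32Float)
                  == 16u * 1024u * 1024u, "float target is 16 MiB");
static_assert(renderTargetBytes(1024, 1024, kBytesPerPixelR8G8B8A8Unorm)
                  == 4u * 1024u * 1024u, "unorm target is 4 MiB");
```

So under these assumptions the float target needs four times the memory, and every read or write moves four times the data.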

March 07, 2015 03:44 PM
unbird

Not sure I follow. You're probably confusing something here. The 32 "limit" (or 64 for AMD) is the so-called warp size (aka wavefront), and to "max out" the GPU's performance the group size (i.e. the product x*y*z of [numthreads(x,y,z)]) should be a multiple of it.
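A small sketch of that rule of thumb, with hypothetical helper names: the group size is the product of the three numthreads values, and the check is whether that product divides evenly by the warp/wavefront size (32 or 64, per the comment above).

```cpp
#include <cstdint>

// Threads in one group: the product x*y*z of [numthreads(x,y,z)].
constexpr std::uint32_t threadsPerGroup(std::uint32_t x,
                                        std::uint32_t y,
                                        std::uint32_t z) {
    return x * y * z;
}

// True when the group size is a whole number of warps/wavefronts.
// waveSize is 32 or 64 depending on the GPU, as mentioned above.
constexpr bool isMultipleOfWaveSize(std::uint32_t groupThreads,
                                    std::uint32_t waveSize) {
    return waveSize != 0 && groupThreads % waveSize == 0;
}

// An 8x8x1 group has 64 threads, which fits both sizes evenly.
static_assert(isMultipleOfWaveSize(threadsPerGroup(8, 8, 1), 32), "64 % 32 == 0");
static_assert(isMultipleOfWaveSize(threadsPerGroup(8, 8, 1), 64), "64 % 64 == 0");
// A 10x10x1 group has 100 threads, which leaves a partly filled warp.
static_assert(!isMultipleOfWaveSize(threadsPerGroup(10, 10, 1), 32), "100 % 32 != 0");
```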

Apart from that, don't bother too much just yet. I consider the switch from normal shaders to compute at least as difficult to understand as the switch from regular programming to graphics shaders. Best to start by playing with available source (e.g. prefix sum). ;)

March 07, 2015 05:41 PM
Paul C Skertich

Yeah, the whole Dispatch(x,y,z) and the whole [numthreads(x,y,z)] thing was tripping me up a bit! It's confusing to get a handle on because you're dealing with each x,y,z thread to fill up a texture, whereas pixel shaders just deal with outputting pixels to the render target or back buffer. The shader code is written differently too: instead of returning a float4 as in a pixel shader, you write something like bufferOut[dispatchThread.xy] = data[dispatchThread.xy].
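For anyone else tripping over the same thing, here is a minimal CPU-side sketch of how the two sets of numbers relate. Dispatch(x,y,z) launches that many thread groups, [numthreads(x,y,z)] sets the threads inside each group, and each thread's dispatch thread ID is its group ID times numthreads plus its ID within the group. `UInt3`, `groupsNeeded`, and `dispatchThreadId` are hypothetical names, and the 32 x 32 group size is only an example.

```cpp
#include <cstdint>

// Hypothetical stand-in for HLSL's uint3.
struct UInt3 { std::uint32_t x, y, z; };

// Number of groups to pass to Dispatch along one axis so that
// groups * groupExtent covers the whole extent (rounds up).
constexpr std::uint32_t groupsNeeded(std::uint32_t extent,
                                     std::uint32_t groupExtent) {
    return (extent + groupExtent - 1) / groupExtent;
}

// SV_DispatchThreadID = SV_GroupID * numthreads + SV_GroupThreadID.
constexpr UInt3 dispatchThreadId(UInt3 groupId, UInt3 numThreads,
                                 UInt3 groupThreadId) {
    return UInt3{groupId.x * numThreads.x + groupThreadId.x,
                 groupId.y * numThreads.y + groupThreadId.y,
                 groupId.z * numThreads.z + groupThreadId.z};
}

// Example: a 1024-wide texture with 32 threads per group along X
// needs 32 groups along X.
static_assert(groupsNeeded(1024, 32) == 32, "1024 / 32 groups");
```

When the texture size is not a multiple of the group size, the rounding up means some threads land outside the texture, so the shader usually needs a bounds check before writing.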

March 09, 2015 04:46 PM