Quote:
Original post by C0D1F1ED
Quote:
Original post by justo
because then you've just repurposed your gpu to be an inefficient ppu!
It really isn't inefficient. It has massive SIMD processing capabilities and a fast memory controller. The PPU is very much like a downscaled version of Cell (i.e. a CPU with multiple SIMD units). So as far as I know the PhysX chip has nothing specific that makes it more suited for physics processing than anything else, and the GPU lacks nothing to efficiently do physics processing.
Quote:
reading back is very much a problem...it is one of the number one bottlenecks in any gpgpu program also utilizing graphics. the main problem is not physical bandwidth, but context switches/memory throughput while doing other operations (in game textures, etc). while i don't have a lot of experience with directx, glReadPixels, for example, is quite slow and will kill performance if you try and do it every frame.
Memory throughput is no problem with PCI-Express. Even the latest graphics cards don't use the full 16x bandwidth.
glReadPixels forces the graphics card to finish all pending rendering operations (i.e. a synchronization point), which can take a while. For physics processing it doesn't have to synchronize with graphics work. It's up to the driver to handle this efficiently, but there is no fundamental limitation that would make accessing the GPU for physics processing any slower than accessing the PPU.
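Side note on the quoted glReadPixels point before I get to my main reply: the stall can be hidden at the application level with pixel buffer objects. Here's a minimal sketch of asynchronous readback; it assumes a GL 2.1 (or ARB_pixel_buffer_object) context is already set up and that width/height are the dimensions of the texture holding the results:

```cpp
// Non-blocking readback through a pixel buffer object (PBO). With a
// GL_PIXEL_PACK_BUFFER bound, glReadPixels returns immediately and the
// copy happens asynchronously; we only pay synchronization when we map
// the buffer, ideally a frame later.
GLuint pbo;
glGenBuffers(1, &pbo);
glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);
glBufferData(GL_PIXEL_PACK_BUFFER, width * height * 4 * sizeof(GLfloat),
             NULL, GL_STREAM_READ);

// Kick off the transfer. With a PBO bound, the last argument is a byte
// offset into the buffer, not a client-memory pointer, so no stall here.
glReadPixels(0, 0, width, height, GL_RGBA, GL_FLOAT, 0);

// ... render or simulate something else in the meantime ...

// Fetch the data; any remaining synchronization cost lands here.
glBindBuffer(GL_PIXEL_PACK_BUFFER, pbo);
GLfloat* results = (GLfloat*)glMapBuffer(GL_PIXEL_PACK_BUFFER, GL_READ_ONLY);
if (results) {
    // ... feed positions/velocities back to the game ...
    glUnmapBuffer(GL_PIXEL_PACK_BUFFER);
}
glBindBuffer(GL_PIXEL_PACK_BUFFER, 0);
```

A driver can of course apply the same overlapping trick internally, which is essentially the "no fundamental limitation" argument above.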
I think the issue here centers around architecture. We have to look at what makes a GPU fast when it comes to graphics: it processes data differently from a CPU. GPUs work well specifically with streaming data, especially when we do the exact same operation on all the data and only occasionally swap operations or perform a state change. The whole graphics subsystem is highly optimized for pushing pixel data in one end and spitting the end result out the other, straight into the display buffer and on to the display.

Anyone who has done GPGPU knows that random memory access is hell for a GPU, precisely because the streaming path is so optimized. We're not talking about random access to main memory, but to texture memory, which is where all the data you need to work with is stored, so the whole PCI-Express question doesn't even come into play. If random access on the card is already slow compared to streaming access, then that becomes the bottleneck when you try to get at your data. Also, for the most part, graphics cards were never designed with streaming data off the card and back into main memory in mind. There just wasn't a need, and it didn't make sense either. Graphics has always been a one-way trip, until recently.
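To make the streaming-versus-random-access point concrete, here's a hypothetical CPU-side micro-benchmark (the names and sizes are mine, and it's only an analogy; on a GPU the "gather" loop corresponds to a dependent texture read, where the gap is even wider):

```cpp
// Same arithmetic, two access patterns: "streaming" walks memory in
// order, the way a GPU consumes texels along a scanline; "gather"
// follows an index table, the way a dependent texture read jumps
// around texture memory, defeating prefetching and caching.
#include <cstdio>
#include <ctime>
#include <vector>

int main() {
    const size_t N = 1 << 22;             // ~4M floats, well beyond cache
    std::vector<float> data(N, 1.0f);
    std::vector<size_t> index(N);
    for (size_t i = 0; i < N; ++i)
        index[i] = (i * 2654435761u) % N;  // scrambled permutation of [0, N)

    // Streaming pass: sequential reads, hardware-friendly.
    clock_t t0 = clock();
    float sum1 = 0.0f;
    for (size_t i = 0; i < N; ++i) sum1 += data[i];
    clock_t t1 = clock();

    // Gather pass: identical work, random reads.
    float sum2 = 0.0f;
    for (size_t i = 0; i < N; ++i) sum2 += data[index[i]];
    clock_t t2 = clock();

    std::printf("streaming: %ld ticks, gather: %ld ticks (sums %f %f)\n",
                (long)(t1 - t0), (long)(t2 - t1), sum1, sum2);
    return 0;
}
```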
So, then the question for ATI and nVidia is: do you sacrifice graphics performance at the hardware level just so you can add physics capability? Do you try to optimize random memory access and lose some of the optimizations done to speed up data streaming? Based on the "no free lunch" principle, you can pretty much expect the gain in physics performance to be proportional to the loss in graphics performance.
As for HavokFX, from their campaign it would seem that they are targeting SLI setups, where there is a semi-free GPU available to do physics. But wouldn't that kind of defeat the whole purpose of SLI? It would mean I just spent almost the same amount of money on a second graphics card that won't be a graphics card when I play games.
One of the selling points of PhysX was also the ability to accelerate real-time skeletal and character animation, which involves a lot of inverse kinematics. Inverse kinematics requires a lot of transformations that are interdependent. So far, I still haven't seen anyone doing GPGPU that can simulate that in real time.
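To show what I mean by interdependent, here's what the data flow looks like in one common IK solver, cyclic coordinate descent. This is my own minimal 2D sketch, not anything from Ageia's SDK; the point is that each joint update has to see the result of the previous one:

```cpp
// One pass of cyclic coordinate descent (CCD) IK over a joint chain.
// joints[i] is the world-space position of joint i; the last entry is
// the end effector. Rotating joint i moves every joint after it, so
// iteration i depends on iteration i+1: the loop is inherently serial,
// an awkward fit for hardware built to run one kernel over thousands
// of independent fragments.
#include <cmath>
#include <vector>

struct Vec2 { float x, y; };

void ccd_pass(std::vector<Vec2>& joints, Vec2 target) {
    Vec2& effector = joints.back();
    for (int i = (int)joints.size() - 2; i >= 0; --i) {
        // Angle that swings (joint -> effector) onto (joint -> target).
        float ax = effector.x - joints[i].x, ay = effector.y - joints[i].y;
        float bx = target.x   - joints[i].x, by = target.y   - joints[i].y;
        float angle = std::atan2(by, bx) - std::atan2(ay, ax);
        float c = std::cos(angle), s = std::sin(angle);
        // Rotate every downstream joint about joint i: this write is
        // exactly the dependency that blocks parallelizing over i.
        for (size_t j = i + 1; j < joints.size(); ++j) {
            float rx = joints[j].x - joints[i].x;
            float ry = joints[j].y - joints[i].y;
            joints[j].x = joints[i].x + rx * c - ry * s;
            joints[j].y = joints[i].y + rx * s + ry * c;
        }
    }
}
```

Jacobian-based solvers have the same serial flavor: you solve a small coupled system per chain per iteration, rather than the same independent operation over a huge stream of elements.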