
Optimization - any way to go faster?

Started by February 09, 2001 04:04 AM
6 comments, last by avianRR 23 years, 9 months ago
I have a program that displays a terrain. I've set it up so I can choose whether or not to use vertex arrays, and I can also set the program to put everything into a display list, so I can select the best combination of display lists/vertex arrays for the card it's running on. But I'm still not pleased with the speed of this code. If anyone could please help me find a way to make it run faster I'd appreciate it! The performance I'm currently getting is:

windowed
~200 fps @ 16x16 (512 tris in 16 strips) = 102k tris/sec
~145 fps @ 32x32 (2048 tris in 32 strips) = 296k tris/sec
~65 fps @ 64x64 (8192 tris in 64 strips) = 532k tris/sec
~20 fps @ 128x128 (32768 tris in 128 strips) = 655k tris/sec
~5 fps @ 256x256 (131072 tris in 256 strips) = 655k tris/sec
~1.25 fps @ 512x512 (524288 tris in 512 strips) = 655k tris/sec

fullscreen
~415 fps @ 16x16 (512 tris in 16 strips) = 212k tris/sec
~335 fps @ 32x32 (2048 tris in 32 strips) = 686k tris/sec
~88 fps @ 64x64 (8192 tris in 64 strips) = 720k tris/sec
~22 fps @ 128x128 (32768 tris in 128 strips) = 720k tris/sec
~5.5 fps @ 256x256 (131072 tris in 256 strips) = 715k tris/sec
~1.35 fps @ 512x512 (524288 tris in 512 strips) = 710k tris/sec

On my card the vertex arrays don't seem to make a difference. When I look at the numbers it actually looks like good performance, until you consider that I need to draw the 128x128 case at 60 fps with other stuff being drawn as well. I seem to be hitting a performance limit at 655K/720K tris/sec, which seems a little low for a GeForce2 MX. Here's the code I'm using:

    if(useArrays){
        glEnableClientState(GL_VERTEX_ARRAY);
        glEnableClientState(GL_NORMAL_ARRAY);
        glEnableClientState(GL_COLOR_ARRAY);
        glVertexPointer(4, GL_DOUBLE, 0, Vertex);
        // stride is in bytes; assuming each Normal is stored as 4 doubles
        glNormalPointer(GL_DOUBLE, 4 * sizeof(double), Normal);
        glColorPointer(4, GL_FLOAT, 0, Color);

        // index pattern for one strip, shifted down one row per pass
        unsigned int *order = new unsigned int[2 * Width];
        for(int w = 0; w < Width; w++){
            order[ w * 2     ] = w;
            order[(w * 2) + 1] = w + Width;
        }
        for(int y = 0; y < Height - 1; y++){
            glDrawElements(GL_TRIANGLE_STRIP, 2 * Width, GL_UNSIGNED_INT, &order[0]);
            for(int w = 0; w < Width * 2; w++)
                order[w] += Width;
        }
        delete [] order;    // free the scratch index buffer

        glDisableClientState(GL_VERTEX_ARRAY);
        glDisableClientState(GL_NORMAL_ARRAY);
        glDisableClientState(GL_COLOR_ARRAY);
    }
    else{
        for(int x = 0; x < Width - 1; x++){
            int x1 = x + 1;
            glBegin(GL_TRIANGLE_STRIP);
            // first two vertices of the strip
            Color[x].glColor();
            glNormal3dv(&Normal[x].X);
            glVertex3dv(&Vertex[x].X);
            Color[x1].glColor();
            glNormal3dv(&Normal[x1].X);
            glVertex3dv(&Vertex[x1].X);
            for(int y = 0; y < Height - 1; y++){
                int y1 = y + 1;
                // next two vertices, one row down
                Color[x + (y1 * Width)].glColor();
                glNormal3dv(&Normal[x + (y1 * Width)].X);
                glVertex3dv(&Vertex[x + (y1 * Width)].X);
                Color[x1 + (y1 * Width)].glColor();
                glNormal3dv(&Normal[x1 + (y1 * Width)].X);
                glVertex3dv(&Vertex[x1 + (y1 * Width)].X);
            }
            glEnd();
        }
    }
- Look at the performance FAQ from nvidia.
- You do not have to enable/disable client state each frame.
- You do not have to call the gl*Pointer functions each frame.
- Put the vertex array in a display list.
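For example, something along these lines (a minimal sketch; Vertex, Normal, Color, Width and Height are the arrays and dimensions from the code above, and SetupTerrainArrays / DrawTerrain are made-up names) does all the client-state and pointer setup once at init time and leaves only glDrawElements in the per-frame path:

    // Built once at startup: (Height - 1) strips of 2 * Width indices each.
    unsigned int *order = 0;

    void SetupTerrainArrays()
    {
        glEnableClientState(GL_VERTEX_ARRAY);
        glEnableClientState(GL_NORMAL_ARRAY);
        glEnableClientState(GL_COLOR_ARRAY);
        glVertexPointer(4, GL_DOUBLE, 0, Vertex);
        glNormalPointer(GL_DOUBLE, 4 * sizeof(double), Normal);
        glColorPointer(4, GL_FLOAT, 0, Color);

        order = new unsigned int[(Height - 1) * 2 * Width];
        for(int y = 0; y < Height - 1; y++)
            for(int w = 0; w < Width; w++){
                order[(y * 2 * Width) + (w * 2)    ] =  y      * Width + w;
                order[(y * 2 * Width) + (w * 2) + 1] = (y + 1) * Width + w;
            }
    }

    // Called every frame: nothing left but the draw calls.
    void DrawTerrain()
    {
        for(int y = 0; y < Height - 1; y++)
            glDrawElements(GL_TRIANGLE_STRIP, 2 * Width, GL_UNSIGNED_INT, &order[y * 2 * Width]);
    }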
The very first thing you should do is take glBegin() and glEnd() out of the loop altogether.

Also, calculate (y1 * Width) once at the start of each loop and use that variable instead of calculating it six times (see the sketch below).

And replace (w * 2) with (w << 1).

That's a start at least.
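As a concrete illustration of the (y1 * Width) suggestion, the inner loop of the immediate-mode path above could hoist the row offset like this (a sketch based on the posted code; rowOffset is a made-up name):

    for(int y = 0; y < Height - 1; y++){
        int rowOffset = (y + 1) * Width;    // computed once per row instead of six times
        Color[x + rowOffset].glColor();
        glNormal3dv(&Normal[x + rowOffset].X);
        glVertex3dv(&Vertex[x + rowOffset].X);
        Color[x1 + rowOffset].glColor();
        glNormal3dv(&Normal[x1 + rowOffset].X);
        glVertex3dv(&Vertex[x1 + rowOffset].X);
    }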



"NPCs will be inherited from the basic Entity class. They will be fully independent, and carry out their own lives oblivious to the world around them ... that is, until you set them on fire ..."
"NPCs will be inherited from the basic Entity class. They will be fully independent, and carry out their own lives oblivious to the world around them ... that is, until you set them on fire ..." -- Merrick
Try to limit the number of calculations you include in the loop. Any calculation which occurs more than once should be replaced with a variable defined only once at the beginning, like Morf said. Also, any variables which need to be defined only once on startup should, of course, not be in the loop at all but in the initialisation code. You should also put in some frustum culling; on a terrain engine this should give you around a 40% boost. Also make sure you have the latest Detonator 3 drivers.
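For the frustum culling, one common approach (a minimal sketch, assuming the terrain is split into patches and the six view-frustum planes have already been extracted as normalized ax + by + cz + d = 0 equations with normals pointing into the frustum; frustum, patch and DrawPatch are made-up names) is to test a bounding sphere per patch and skip the draw calls for anything completely outside:

    // Returns true if the sphere (cx, cy, cz, radius r) is at least partly
    // inside the frustum described by six plane equations.
    bool SphereInFrustum(const float frustum[6][4], float cx, float cy, float cz, float r)
    {
        for(int p = 0; p < 6; p++){
            float dist = frustum[p][0]*cx + frustum[p][1]*cy + frustum[p][2]*cz + frustum[p][3];
            if(dist < -r)
                return false;   // entirely behind this plane: cull it
        }
        return true;
    }

    // In the terrain loop, only submit geometry for visible patches:
    //     if(SphereInFrustum(frustum, patch.cx, patch.cy, patch.cz, patch.radius))
    //         DrawPatch(patch);

Testing coarse patches rather than individual strips keeps the cost of the culling itself negligible.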

For info, I get 80 fps @ 640x480 with 32768 triangles (2,560,000 tris/sec) on a GeForce 1 with a Celeron 2 @ 920 MHz, using vertex arrays. I believe the GeForce2 MX has a similar or slightly better GPU. Also, what processor are you using? If you're running on a 200 MHz processor, there's no need to look any further...
Keermlack: It's a PII 350 MHz, so my results are proportional to what you're getting. That could mean I'll have to require a faster system for the finished product (won't be able to afford that for a while; my gas bills this winter are triple what they were last year!), and with the end requirements I'll end up having a minimum system of about a P4 1.5 GHz processor.

Morfe: I have already optimized other code by removing a lot more mults than are in this code, and the difference is negligible (a 0.1% increase from removing 70K+ mults per frame!). The compiler is doing a lot of that kind of optimization already.

Anonymous poster: If I don't have to call the gl*Pointer functions every frame, how would I do that? You might not be considering that I have several of them (this code is run on multiple data sets). And I'm already putting the array in a display list (only a <5% gain). To be honest, this code is only called once, to put the whole thing into a display list. And moving the client state enable/disable to the init code didn't make any difference.
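One way to avoid re-specifying the pointers every frame even with several data sets (a sketch only; TerrainSet, BindTerrainSet and currentSet are made-up names, and it assumes each data set keeps its own Vertex/Normal/Color arrays) is to cache which set is currently bound and only change the pointers when it actually changes:

    struct TerrainSet {
        double *Vertex;   // 4 doubles per vertex
        double *Normal;   // 4 doubles per normal
        float  *Color;    // 4 floats per color
    };

    static const TerrainSet *currentSet = 0;

    void BindTerrainSet(const TerrainSet *set)
    {
        if(set == currentSet)
            return;                         // pointers already set up, nothing to do
        glVertexPointer(4, GL_DOUBLE, 0, set->Vertex);
        glNormalPointer(GL_DOUBLE, 4 * sizeof(double), set->Normal);
        glColorPointer(4, GL_FLOAT, 0, set->Color);
        currentSet = set;
    }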

In short, it looks like I may have optimized my code as best I can, but I'm hitting a barrier with my processor. And to think, a month ago I bought the new video card because my processor was waiting on my old one! How ironic!

Thanks for all the info. I think I know what the problem is now, I just can't afford to fix it.
Well, you could be dirty and write card-specific code...
Don't tempt me, I wrote SVGA libs for DOS years ago...
As for my question, I've made several discoveries:
- I forgot to put in glEnable(GL_CULL_FACE);. That gave me a +15% performance boost.
- I was using doubles for the high precision needed in another part of the program. I found a way to use floats, referencing the height field to its own origin rather than the data origin, and still have the precision I need (see the sketch below). That gave me a +50% performance boost.
The result is that I'm now running at 12.5 fps, up from the previous 5 fps:
12.5 fps @ 256x256 = 131072 tris = approx. 1.6M tris/sec
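For reference, the float trick described above can look something like this (a minimal sketch of the idea rather than the actual code; originX/Y/Z and VertexF are made-up names, and it assumes the precise positions are still kept in doubles on the CPU side):

    // Precise origin of the height field, kept in doubles.
    double originX, originY, originZ;

    // Each vertex is stored as floats *relative to that origin*, so the
    // values stay small enough for single precision:
    //     VertexF[i] = (float)(VertexD[i] - origin);

    // At draw time, translate by the origin once (in double precision)
    // and submit the float data.
    glPushMatrix();
    glTranslated(originX, originY, originZ);
    glVertexPointer(3, GL_FLOAT, 0, VertexF);
    // ... glDrawElements(...) as before ...
    glPopMatrix();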

What I've learned from this, as far as GeForce cards go, is that:
- Doubles in a vertex array will kill performance by 50-75%.
- Doubles in general with OpenGL are bad: a 10-25% loss.
- Vertex arrays provide more of a boost than display lists.
- Vertex arrays compiled into a display list are worse than rendering them directly.

Note: these observations are based on a system running a GeForce2 MX and a GeForce2 GTS. Other chipsets will behave differently.
You might want to take a look at this: http://www.mesa3d.org/brianp/sig97/perfopt.htm

It may not help you, but maybe it will?


http://www.gdarchive.net/druidgames/

This topic is closed to new replies.
