Carmacks sqrt()
Last week i saw a topic in which someone posted a copy of the sqrt function used in quake 3 but i was at work and couldn''t grab it.
If anyone has it can i have a copy plz. Also how much of a loss in accuracy does it take for speed?
float InvSqrt (float x){ float xhalf = 0.5f*x; int i = *(int*)&x i = 0x5f3759df - (i >> 1); x = *(float*)&i x = x*(1.5f - xhalf*x*x); return x;} I believe it was inverse square root. To test it I used: int main(){ cout << "Using Carmacks: " << 1 / InvSqrt(3) << endl; cout << "math.h: " << sqrt(3) << endl;}
And it was pretty accurate.
I did a quick (and I believe innacurate, because I used SDL's getticks instead of other means due to inexperience) test using full compiler optimization in Dev-c++ of the time required to do 10,000,000 loops on both math.h's and carmacks.
Carmack: 28ms
math.h: 232ms
Hmm. That's an awfully large difference. Please correct me if I'm wrong.
[edited by - Ronin Magus on February 15, 2003 11:35:18 AM]
Doh. This method (Newton-Rhapson) has been used for years, and the theory behind it is hundreds of years old. It's a damn shame if people begin to call it Carmack's sqrt now, just because Carmack has a guru's status and a cool sounding name.
- Mikko Kauppila (whose name doesn't sound cool
[edited by - uutee on February 15, 2003 11:59:23 AM]
- Mikko Kauppila (whose name doesn't sound cool
[edited by - uutee on February 15, 2003 11:59:23 AM]
Well I just called it that because I''m no math person, I saw this post on the front page and remembered copying that code.
I wouldn''t say it is awefully large. Rather ten to one is what I generally use as a rule of thumb as to whether something is worth while. Within performance you can get 100 to 1 and 1000 to 1 improvements, i.e. bubble sort versus quicksort. With 10 to 1 you might as well go ahead and do it without compelling reasons otherwise. With 10% it better be a big part of your processing. 90% of 10% is just as big as 10% of 90%, but a whole lot less work as a general rule.
i''m not seeing how this works.
i tried to do it by hand and i''m getting weird numbers...
float x == 4
float xhalf == 0.5f * 4 == 2
int i == 2
i == 5f3759df (which is 1597463007 in decimal) - (2 >> 1)
== 1597463006
x == 1597463006
x == 1597463006*(1.5f - 2*x*x)
x == some ridiculously long number:
(8153093528352233345939813923)
and the inverse is (1/x): 1.22 e-28.
in other words not 2.
can someone tell me where i went wrong with the calculation?
i tried to do it by hand and i''m getting weird numbers...
float x == 4
float xhalf == 0.5f * 4 == 2
int i == 2
i == 5f3759df (which is 1597463007 in decimal) - (2 >> 1)
== 1597463006
x == 1597463006
x == 1597463006*(1.5f - 2*x*x)
x == some ridiculously long number:
(8153093528352233345939813923)
and the inverse is (1/x): 1.22 e-28.
in other words not 2.
can someone tell me where i went wrong with the calculation?
quote:Original post by Alpha_ProgDes
can someone tell me where i went wrong with the calculation?
You are using integers in your calculation.
The function relies on the different representations of integers and floats in memory. This is why there are pointer dereferences instead of simple assignments.
I know that in the SSE instruction set there is a similar instruction (rsqrt*s) that uses an on chip LUT.
Then just toss in a (rcp*s) to take a fast reciprocal of that, and you have the square root.
SSE is supported by Pentium 3/4 and Athlon XP, but you might want to check out 3DNOW! stuff for AMD processors because you might happen to find a quicker way of doing it.
Then just toss in a (rcp*s) to take a fast reciprocal of that, and you have the square root.
SSE is supported by Pentium 3/4 and Athlon XP, but you might want to check out 3DNOW! stuff for AMD processors because you might happen to find a quicker way of doing it.
There''s a thing I don''t understand. Why the: int i = *(int*)&x instead of: int i = (int)x;
Darookie mentioned something about it, but what are the details behind that method?
Could anyone explain it briefly?
Click NOW!
Darookie mentioned something about it, but what are the details behind that method?
Could anyone explain it briefly?
Click NOW!
This topic is closed to new replies.
Advertisement
Popular Topics
Advertisement