how to calculate sine and cosine in c++ without STL

Started by
22 comments, last by MrRowl 4 years, 6 months ago

Here's a much more precise approximation (error less than 0.00001) :


// This works for 0 <= x <= pi
float approx_sin(float x) {
  float const pi = std::atan(1.0f)*4.0;
  float const four_over_pi_sq = 4.0f/(pi*pi);
  float const pi_halves = pi/2.0f;
  float const k1 = 0.0368071161f;
  float const k2 = 0.4665308893f;
  float const k3 = 0.0012713381f;
  return x*(pi-x)*(four_over_pi_sq-std::pow((x-pi_halves),2)*(k1-(x-k2)*(x-(pi-k2))*k3));
}

 

Advertisement
On 10/27/2019 at 7:29 PM, Bregma said:

Why bother with the union for i and f?

 

The union can be removed for greater speed. When I wrote this function, I used the union as a static member.


static union{int i; float f;} u;

which consumes less memory and maintains the same speed. The union takes the largest size bits. In this case the float type. The function can be better written in the form.

 


void sameRadian(const float& rad, float& c, float& s)
{
   int i; float f;

   i = int(rad * 0.31830988f + 0.5f);

   c = (i % 2)? 3.14159265f * i - rad - 1.570796f : rad + 1.570796f - 3.14159265f * i;
	
   i = int(rad * 0.31830988f);
	
   s = (i % 2)? 3.14159265f * i - rad : rad - 3.14159265f * i;
	
   if(c < 0.0f)
   {
      f = (1.273239f + c * 0.405284f) * c;
      c = (f + 1.0f) * f * -0.225f + f;
   }
   else
   {
      f = (1.273239f - c * 0.405284f) * c;
      c = (f - 1.0f) * f * 0.225f + f;
   }

   if(s < 0.0f)
   {
      f = (1.273239f + s * 0.405284f) * s; 
      s = (f + 1.0f) * f * -0.225f + f;
   }
   else
   {
      f = (1.273239f - s * 0.405284f) * s; 
      s = (f - 1.0f) * f * 0.225f + f;
   }
}

 

 

the union is unnecessary only delays the function.

 

On 10/28/2019 at 10:22 AM, alvaro said:

Here's a much more precise approximation (error less than 0.00001) :

On 10/28/2019 at 10:22 AM, alvaro said:

Here's a much more precise approximation (error less than 0.00001) :

 

 

Hola Álvaro

This function is almost 10 times slower than my current function and is also less accurate in some cases, for example "sine (30)" and still depends on the STL. If I wanted a precise but slow function, I would simply use the Taylor series for sine and cosine, which converges slowly but obtains the exact result.

descarga.png.0ae8f51a3e7a598d7736273b7b9ae4bc.png

10 times slower? How do you compile your code? I am using g++-7.4.0 with options "-O3 -march=native" and my function seems to be a bit faster than yours. Of course, I only handle angles between 0 and pi, and I only return the sine.

I am not sure what you think "the STL" means. Assuming you mean "the standard library", you can trivially replace my function calls to define constants by the constants themselves (at the risk of making the code harder to understand). The call to "std::pow" is just a multiplication. But what's the point of not using the standard library? Perhaps you said it earlier in the thread and I missed it.

 

The trigonometric math funcions live in cmath, now the question could be asked if the c standard library is part of the c++ template library; i thought that containers, iterators, algorithms etc. make up the STL, but i am probably wrong ?

The constants in @alvaro's version need not be computed on every function call (just sayin' :-)). Some of them are constants that live in the math library anyway. Or they can be precomputed with ludicrously high precision if necessary.

Speaking of templates, i find it nice to have the possibility of using the datatype that's appropriate for a situation without having to cast around. The implicit casts between int and float from the example may actually affect the outcome. Also, the fixation to float may be limiting.

Searching the web, there seem to be a plethora of possibilities to calculate trigonometry with speed xor precision.

Sparing 0.08 seconds on 10 million function calls (debug build, i7 3.6Ghz) isn't decisive for my situation. I'd rather shift functionality that needs a trig call (angle of view only as an example that comes to my mind) out of each iteration of the mainloop to the situation when the user actually changes the zoom. Trigonometry needed for meshes for example isn't decisive for execution time (done offline or asynchronously on load) but needs precision. Edit: camera view angle is computed (usually) 60 times per second. Trig calls aren't decisive there too i'd think.

The gnu scientific library, and probably others like Octave and similar sdks, offer different ways to calculate almost everything, mostly aimed at higher than "normal" precision.

But then again, it may be a sport to do these things on foot. I understand that ?

2 hours ago, alvaro said:

10 times slower? How do you compile your code? I am using g++-7.4.0 with options "-O3 -march=native" and my function seems to be a bit faster than yours. Of course, I only handle angles between 0 and pi, and I only return the sine.

I am not sure what you think "the STL" means. Assuming you mean "the standard library", you can trivially replace my function calls to define constants by the constants themselves (at the risk of making the code harder to understand). The call to "std::pow" is just a multiplication. But what's the point of not using the standard library? Perhaps you said it earlier in the thread and I missed it.

 

 

Hola Como estas! Alvaro

I think that the two functions cannot be compared because their function cannot calculate the sine at all angles and much less negative angles and also because it uses atan that is part of the STL. I think it's not worth it, I say it in a good way and I don't try to insult him. That is all I have to say.

 

 

Hola, Gustavo:

I don't know what the problem with using the standard library is (and it's not called "the STL": The STL was an old template library of containers, which was influential in the containers that eventually made it to the standard library; but here we are not even talking about that part of the standard library).

Anyway, here's a version of the function that works reasonably for a larger range of numbers, and it doesn't use any functions from the standard library:


float approx_sin_0_pi(float x) {
  float const k1              = 0.0368071161f;
  float const k2              = 0.4665308893f;
  float const k3              = 0.0012713381f;
  float const pi              = 3.1415926536f;
  float const four_over_pi_sq = 0.4052847346f;
  float const pi_halves       = 1.5707963268f;
  return x*(pi-x)*(four_over_pi_sq-(x-pi_halves)*(x-pi_halves)*(k1-(x-k2)*(x-(pi-k2))*k3));
}

float approx_sin(float x) {
  int i = static_cast<int>(x * (1.0f/pi));
  x = (i & 1) ? pi * i - x : x - pi * i;
  return x > 0.0 ? approx_sin_0_pi(x) : -approx_sin_0_pi(-x);
}

The polynomial approximation used here gives correct results for the important angles 0, pi/2 and pi. The constants k1, k2, k3 have been chosen to minimize the maximum error in the whole range. If you were to use the Taylor series formulas you posted earlier, you'd get a lot of precision around 0, but the error would be much larger near pi, if you use a polynomial of comparable degree.

Anyway, this is just a middle ground between speed and precision, which might be interesting in some cases.

 

Salud,

Álvaro.

 

 

One more version, and I'll go to bed, I promise. :)

 


#include <cmath>
  
float const pi = 3.1415926536f;

float approx_sin_0_pi(float x) {
  float const k1 = 0.0368071161f;
  float const k2 = 0.4665308893f;
  float const k3 = 0.0012713381f;
  return x*(pi-x)*(4.0f/(pi*pi)-(x-(pi/2.0f))*(x-(pi/2.0f))*(k1-(x-k2)*(x-(pi-k2))*k3));
}

float approx_sin_no_branching(float x) {
  x -= (2.0f*pi)*std::nearbyint(x*(0.5f/pi));
  return std::copysign(approx_sin_0_pi(std::abs(x)), x);
}

This version doesn't have any branching, so in situations where the branches are hard to predict (say, test the functions using random numbers between -10 and +10), this version is much faster, even though the call to `nearbyint' is not inlined.

 

On 10/29/2019 at 11:22 PM, gustavo rincones said:

the union is unnecessary only delays the function.

How does it delay exactly?

On 10/30/2019 at 12:02 AM, alvaro said:

But what's the point of not using the standard library?

Because they are using boost or some other home-grown library instead, perhaps? I'm just guessing. I happy about it because without STL the code is a perfectly viable, libc-independent C code, free of OOP overhead ?

On 10/30/2019 at 5:04 AM, alvaro said:

This version doesn't have any branching, so in situations where the branches are hard to predict

Not just that, with branching there's always a chance of CPU instruction cache miss, and a need for prefetching and decoding new instructions from memory. I think your last solution is perfect! It does not use STL, precise enough and it is as fast as possible. Well done!

Cheers,
bzt

39 minutes ago, bzt said:

Because they are using boost or some other home-grown library instead, perhaps? I'm just guessing. I happy about it because without STL the code is a perfectly viable, libc-independent C code, free of OOP overhead ?

I haven't followed all the details of the thread, so maybe I'm missing some context here. But isn't this thread about implementing trig functions by hand in C++?

As has been discussed elsewhere in the thread, 'STL' isn't the right term here. Also, I'm not sure Boost is something you'd use 'instead of' the standard library. It's more of an expansion of it (and of course various functionality has migrated from Boost to the standard library over time).

If we're talking about trig functions, I'm not sure how OOP comes into play. And I wouldn't just assume OOP introduces extra overhead. It could of course, but I think you'd have to evaluate that on a case-by-case basis. And only parts of the standard library use OOP to begin with. Lastly, not using the standard library doesn't leave you with OOP-free C - it just leaves you with C++ without the standard library.

Anyway, there of course may be reasons to use or not use a particular tool, but I'd question these particular reasons, at least in this case. (Just based on my initial impression at least - maybe I'm missing something.)

This topic is closed to new replies.

Advertisement