Help with SAT collision implementation

Started by
3 comments, last by Alberth 1 year ago

Hello, everyone

I'm trying to implement the SAT collision test. I've read articles about how it works and want to write a program. I've decided to use C++.

I'm following a tutorial, but not sure about the optimization. I want to make this code as fast as possible. It may be C++ associated mistakes or something in the algorithm.

Code:

#include <iostream>
#include <cmath>
#include <deque>


struct Projection{
	Projection(double a, double b): min(a), max(b){}	
	double min;	
	double max;	
};

struct Axis{
	std::pair<float, float> p;	
	double dot(const std::pair<float, float>& v) const{
		return v.first*p.first + v.second*p.second;	
	}
};

struct Shape{
	std::deque<std::pair<float, float>> vertices;

	Projection project(const Axis& axis){
		double min = axis.dot(vertices[0]);
		double max = min;
		for (size_t i = 1; i < vertices.size(); i++) {
			// Normalize the axis
			double p = axis.dot(vertices[i]);
			if (p < min){
				min = p;
			} else if (p > max){
				max = p;
			}
		}
		return Projection(min, max);
	}	
};


std::pair<float, float> calc_normal(const std::pair<float, float>& v){
	return std::pair<float, float>(-v.second, v.first);
}

void calc_axes(Shape& shape, std::deque<Axis>& axes){
	for (size_t i = 0; i < shape.vertices.size(); i++) {
		// get the current vertex
		std::pair<float, float> p1 = shape.vertices[i];
		// get the next vertex
		std::pair<float, float> p2 = shape.vertices[i+1];
		if(i == shape.vertices.size() - 1){
			p2 = shape.vertices[0];
		}
		// subtract the two to get the edge vector
		std::pair<float, float> edge = p2;
		edge.first -= p1.first;
		edge.second -= p1.second;
		Axis a;
		a.p = calc_normal(edge);
		axes.push_back(std::move(a));
	}	
}

bool overlap(const Projection& p1, const Projection& p2){
	return !(p1.max < p2.min || p2.max < p1.min);
}

bool check_inter(Shape& shape1, Shape& shape2){
	std::deque<Axis> axes1;
	calc_axes(shape1, axes1);
	for(size_t i = 0; i < axes1.size(); ++i){
		Projection p1 = shape1.project(axes1[i]);
		Projection p2 = shape2.project(axes1[i]);
		if(!overlap(p1, p2)){
			return false;	
		}
	}
	std::deque<Axis> axes2;
	calc_axes(shape2, axes2);
	for(size_t i = 0; i < axes2.size(); ++i){
		Projection p1 = shape1.project(axes2[i]);
		Projection p2 = shape2.project(axes2[i]);
		if(!overlap(p1, p2)){
			return false;	
		}
	}
	return true;
}

int main(){
	srand(time(0));

	Shape shape1;
	shape1.vertices.push_back(std::pair<float, float>(0.f, 0.f));
	shape1.vertices.push_back(std::pair<float, float>(1.f, 1.f));
	shape1.vertices.push_back(std::pair<float, float>(0.f, 1.f));
	
	Shape shape2;	
	shape2.vertices.push_back(std::pair<float, float>(1.f, 0.f));
	shape2.vertices.push_back(std::pair<float, float>(2.f, 1.f));
	shape2.vertices.push_back(std::pair<float, float>(1.f, 1.f));
	
	bool res = check_inter(shape1, shape2);
	std::cout << res << "\n";	
}

Thank you.

None

Advertisement
  • Don't use std::deque, there's no need in this code. Use std::vector instead. std::deque is not implemented in an efficient manner.
  • Don't create a std::deque (or std::vector) inside the function. This means every function call allocates memory (slow). You can reuse the data structure by storing it somewhere persistent and clear() when you are done using it. You could wrap this code in a class, which has a std::vector member variable that is reused.
  • Make a Vector2 class instead of std::pair<float,float>, with overloaded arithmetic operators. This will make the code cleaner and less likely for mistakes.
  • Don't mix double and floating point arithmetic. You use double in some places and floats in others. Keep it consistent to avoid type conversions. Ideally, you can make it a template so that it works for both types. (though I stick to just floats in practice)

There's a few things that come to mind:

  1. std::deque is not the most efficient container. Especially since you don't use the pop_front and don't modify the vertices dynamically in the algorithm, you should just use std::vector for the “vertices”. Also in calc_axes, the amount of elements seems to be fixed (= to the vertices), so you should also use a vector and reserve it to the correct size.

Everything else is dependant on you using at least c++11:

  1. Eigther reuse the same container for all calls of calc_axes, or just return it from the function (instead of passing it as a parameter). The way you do it know offers no benefit, and in c++11 > returning a local variable uses at worst a move (which doesn't allocate) and should normally just create the variable outside anyways.
  2. Use range-based for: “for (const auto axis : axes2)” instead of index-based loops. This can be more efficient since the “size” parameter is cached, unlike what you do now where it will be called each loop-iterations. While a smart optimizer might be able to do that caching itself, this depends on it being able to see entire code, so as soon as you have a function that is not in the same translation unit, or a callback, it will pretty much lose the ability to do so. range-based for is optimal in terms of performance and ease of use, only reason not to use it is if you need to modify the container while iterating.
  3. Even if its just the setup-code, you should use emplace_back and reserve for the vertices:
Shape shape1;
shape1.vertices.reserve(3);
shape1.vertices.emplace_back(0.f, 0.f);
shape1.vertices.emplace_back(1.f, 1.f);
shape1.vertices.emplace_back(0.f, 1.f);

reserve is something that a compiler is very unlikely to ever be able to do for you, as unreserved this code would end up allocating probably 4 elements of space, in up to 3 allocate-operations. reserving the right amount of elements reduces it to 1 allocation of just the right amount of memory (but you should only do that if you do know the amounts of elements). emplace_back will likely not see benefits in an optimize build when a std::pair is used; but could yield benefits with more complex classes (=with a ctor that has side-effects). Also, in debug-builds, you will save a few operations, which could also be valuable.

Can't really say much about the algorithm since I don't know it very well anymore. But that should give you a few pointers.

Figuring out what to do and at the same time aiming for as fast as possible makes the implementation very difficult.

When I implement a new algorithm I aim for correctness first. That is, implement it as specified. Speed is a non-issue at this point. I typically use Python for that. Nicely expressive, good for experimenting if it doesn't work.

Once I got it working correctly, I re-implement in whatever it needs to be in. Correctness has been covered (you know what to code), here you can focus on performance.

This topic is closed to new replies.

Advertisement