What kind of coding standard do you prefer

Ivica Kolic · 2022-12-20T20:57:39

So, other than the official answer “what ever standard client prescribes”, what kind of coding standard do you use at home in your personal projects?Here is mine:Standard is optimized for speed, simplicity and readability and fits on one page. To get the best cache performance it is wise to keep your memory sequential and compact as possible (DOP instead of OOP at least at the lower levels). ECS is main paradigm.using vectors instead of object pools. I want sequential and pre-allocated memory and even if memory isn't preallocated (empty vector) it will become so after a few frames (when enough objects get created). Maybe add another vector to track free elements (or wrap those two ones in a template called ConservativeVector - but that is practically object pool if you preallocate enough memory). Be careful with references. If you have a lot of dynamic object this might not be the best solution.using indexes instead of pointers (this goes together with a point above). In most cases unsigned short will be enough and it will be 4 times smaller than pointer so your data will fit in cache better. This also means that you absolutely don't have to write copy constructors and have any kind of memory management (or very little) - literally entire engines and editors can be written without a single new, delete, malloc and free command (at least for static data like meshes and textures). Another benefit is the code can be run on GPU and CPU all the same (if you wrap recursion). I've implemented ray-tracing octrees this way and did all the debugging on the CPU. My BVH hierarchy copies to GPU the same way. If you have some extremely complex hierarchy, debugging with indices is much easier than reading pointers (easier to draw entire hierarchy on paper and setting break points).using Memory Context class to store all those vectors that would otherwise be created as a local variables in the functions. Of course, these memoryContext objects should be reused (and passed by reference) so memory allocations won't happen on every use but only on first few usages. If you have a thread pool, have one of these objects created for every thread. Be careful with recursive functions.don't use sets or lists - they are 10-20 times slower than vectors. Migrate all your helper libraries to use vectors for internal storage (RadixTries, PatriciaTries, Dictionaries, Octrees, Radix Sort functions, geometry processing functions…)SortForEfficiency is a common function in my helper libraries.Always pass by reference. Don't return vectors, but pass them as input-output parameter. That way memory pre-allocation will be maintained.use JSON for storage and clipboard copy paste. It is simple and it will maintain compatibility within different versions of application and anyone can code plugins in any language they want. Nobody has to waste time with writing custom DSL parser every time. If you are very lazy, you can even just dump those vectors in a file and hope that format won't change but I wouldn't rely on that.greedy thread pool. If it isn't faster than OpenMP you haven't created a fast thread pool. In all my use cases I only needed a thread pool where all the threads run the same task so I've actually omitted single threaded type of task in the latest implementation. It is kind of like async but more. There are 3 levels of task priority and only first one is non blocking. If bottleneck happens all the free threads can help the ones that are still working by preforming sub tasks (there are macros that I use in all of my code that will do parallel processing if thread pool is present or just single core if not). One thing, if some thread submits a task to queue and manages to finish it before anyone helped - that is OK - there won't be any penalty like with OpenMp.I rarely use inheritance and virtual functions but good example of it is the GUI system (it is highly efficient and packs one integer per graphical element instead of entire geometry of 4 vertices which is 28 times bigger like with some other libraries). So, in some cases OOD is more elegant than DOD but I mostly prefer DOD. OOP example is using virtual functions for OnCreate and OnUpdate on your engine's objects. On the other hand, if you use DOD and composition, these function can just be properties of the object. If they are set then call them, otherwise do the default functionality.not using single file per class. I like to keep things that make sense together grouped. I also prefer header only libraries for smaller things that won't change much in the project (many times these classes are templated so there isn't much choice). Bigger classes that are updated frequently should have both H and CPP files.That is it. Almost no memory management and very fast code (for certain types of project where most of the objects is mostly static).Edit: edited for clarity.

frob

46,223

December 16, 2022 07:01 PM

Learn your tools and choose the best tools for the task.

For containers, a simple linear array is the most common use pattern so it is a good default choice. Except when it isn't. Experience can tell you to use something else, and barring that, your profiler will tell you.

Memory pools, small object allocators, and custom heaps are great tools when you need them, but unnecessary costs when you don't. Experience can guide you, as can your profiler.

Be willing to change implementation details after getting hard data about performance. Nobody picks the right tool 100% of the time. Measure early, measure often, monitor trends.

For code style, do not fight your tools. If your editor defaults to 4 spaces or 3 spaces or 8 spaces, go with it. If your editor wants to change spacing or layout, go with it. If a library uses a different style, use it there. If a tool generates code that follows a different style, use it in those files. “When in Rome, do as the Romans.”

Sometimes there is no right decision, each option has tradeoffs at it comes down to personal preference. In that case choose what feels best at the time and go with it.

Be flexible and learn to bend along with the wind. Those who are inflexible will break.

JoeJ

4,185

December 16, 2022 07:11 PM

Juliean said:
As for object-pools - I did implement pooling recently for my ECS, but not using lists or anything. The objects are still there in memory, only being treated as-if deactived. For scenarios with lots of spawned objects that have to be destroyed often, this saves a lot of overhead from having to manipulate the vectors all the time, but it still means that when accessing entities/components, its not based on a list. If I made all those things lists, then the performance would degenerate in pretty much every scenario. The way it is, pooling can decrease overhead in certain scenarios by 2-3x (even though the objects are not fully removed and “holes” in the memory layout could appear), but when not used the performance is still optimal for iteration.

I'm not sure if i properly understand the terms (pools, freelist) in a common sense, but i guess object pool means preallocated memory to hold X objects (each having the same size). To manage the allocations, you need a list. But it's not necessarily used to iterate the objects. For iteration i mostly have to use some spatial structure (tree, grid), which then holds indices / pointers to the objects.
In such scenario pretty anything is random access. Pointer chasing while traversing the tree, then iterating objects of a node which may be in random memory locations.
And to improve this, i can use larger branching factors to reduce tree depth and have pointers in continuous memory, and i can reorder to objects by tree order, so they are sequentially too. (To stick at the given examples from above)

Now the interesting thing here is: Although that's every day work for most stuff that i do, i have not found a way to help myself reusing code.
Always i need to make such systems almost from scratch, because subtle differences on the actual problem lead to very specific solutions. It's always kind of the same concepts which give a solution, but the implementations differ.

I guess many programmers could mention something similar, so i begin to question if maybe our workflow with programming is inefficient and could be improved. C++XX is not enough, experience is not enough, so maybe a revolution is to be expected.

Did you see the post about the AI Bot to help with game design? Try it out, and also look at my response about natural language to code AI. This already works pretty well. They guy who showed it to me has tried it. He asked for a box collision detection code. He did not only get cleanly written code which works, it was also well documented. Plus some extra explanation.
That's really interesting. I'm not easily hyped by ML progress, but i would not wonder if this might affect us earlier than expected. Artists already complain about AI taking their art without permission to do their job, maybe we are next.
If so, i'm not sure if that's going to be more curse or blessing. The thought kinds sticks with me…

Juliean said:
a list (which is really bad when you are concerned about dynamic allocations because every new node will be its own allocation)

Our interpretation differs: To a list just means having a pointer to the next object, but does not include the object allocation topic.
That's why brought up the pool example, where the list lives in a fixed sized array.

Coming back to the vector, leaving in deleted objects, we could use a skip pointer per object to get to the next active object.
Then we get mostly sequential iteration of memory, with the exception of skipping over unused space, which is good. But again i would say that's a list then.

But i agree std::list as a container is a no go. I realize that's what you wanted to hear, i guess : )

Juliean said:
Even for Octrees, I'm not convinced that they are really a good solution all the time, or as often as people use them.

You are right. And that's your problem. : )
Because you're a game dev, you have learned that iterating stuff in linear fashion is mostly faster than fancy trees.
And then you stick at that. Using brute force all the time. You even became expert in making brute force as fast as possible with low level optimizations, and you give talks at Siggraph about your latest and greatest brute force pixel shader.
Everybody listens, and continues the tradition…

But. You guys dis forget about something.
The reason why brute force works well for games is this:
The work efficient O(log N) algorithm has higher complexity, and this complexity generates some base cost. The simple brute force O(N) algorithm has no such cost.
Thus, the brute force algorithm wins as long as the problem is small enough.

But it's 2022. Jensen sells 80 teraflop GPUs for 2000$. Lisa and Pat sell CPUs with 16 cores.

Do you really think that our problems are still that small that they are below this ‘small workload’ threshold?

Or might it eventually be that it's not the case anymore, since years already. And your failure to change outdated brute force habits is the damn primary reason why the hell we have 2000$ GPUs now, killing our business?

hihihihi… sorry. But to me it really looks it's a yes to the latter. ;D

Juliean said:
Even if removal in a list is technical O(1), you still have to find the node. And that is again O(N)

Maybe you'd like double linked lists better. : )

frob

46,223

December 16, 2022 08:40 PM

JoeJ said:
I'm not sure if i properly understand the terms (pools, freelist) in a common sense, but i guess object pool means

This looks like a great growth area for you. There are quite a few useful and interesting memory management structures and tools out there to explore. You might start with the various types of pools offered in the Boost.Pool library as a jumping-off point.

JoeJ said:
i have not found a way to help myself reusing code. Always i need to make such systems almost from scratch, because subtle differences on the actual problem lead to very specific solutions.

Seems like another wonderful potential growth area for you. A sign of mature software engineering is that you can create general-use solutions that satisfy specific needs. There are several behavioral patterns that let you provide a general framework with ways to specify a specific internal behavior. A “Strategy” design pattern is the classic GoF pattern, but really anything that lets you adapt the details can work.

JoeJ said:
Our interpretation differs: To a list just means having a pointer to the next object, but does not include the object allocation topic.

A C++ list means something very specific. It is covered in C++20's standard from pages 828 to 842, 14 pages with all kinds of performance and allocations specifications, plus more definitions as it is referenced in a few other places such as general container requirements and sequential container requirements. Much of your “interpretation” is non-complaint.

JoeJ said:
But. You guys dis forget about something. The reason why brute force works well for games is this: … Do you really think that our problems are still that small that they are below this ‘small workload’ threshold? … And your failure to change outdated brute force habits is the damn primary reason why the hell we have 2000$ GPUs now, killing our business?

Please avoid strawman arguments.

While what you wrote is true for a tiny subset of situations, it is broadly false on it's face. The assertions give a gross misunderstanding about what graphics cards are doing and what processing is taking place, or at least a gross misstatement of fact.

JoeJ

4,185

December 16, 2022 09:56 PM

frob said:
A C++ list means something very specific. It is covered in C++20's standard from pages 828 to 842, 14 pages with all kinds of performance and allocations specifications, plus more definitions as it is referenced in a few other places. It is not subject to much interpretation.

But the topic was lists as a generic data structure, not C++ STL implementation of it as a container. At least i did not notice std::list was meant.

frob said:
Please avoid strawman arguments. While what you wrote is true for a tiny subset of situations, it is broadly false on it's face.

Somehow i have expected you would feel triggered from my disrespect, since you're industry veteran at senior level. However, my critique is focused on graphics in games, no other fields.

frob said:
reveals a gross misunderstanding about what graphics cards are doing, or at least a gross misstatement of fact.

I know pretty well what graphics cards can do. Currently most GPU work is spent on lighting, and that's exactly where my critique comes from: I have never seen any lighting solution in games which is not brute force.

Currently there is a lot of progress on realtime GI, but the cost is much too high. I expect many will quit PC gaming because HW is so expensive, especially GPUs. Don't you share those worries to some degree?
I mean, Jensen himself declared Moors Law being dead, which basically means: To push graphics further, the next generation of GPUs will be even more expensive. How does the games industry intend to deal with this?
The obvious solution seems to be: Accept HW stagnation, keep minimum specs low. Instead relying on progress for free from faster HW, work on better software, which can only mean to finally explore non brute force solutions for realtime lighting. I know this exists, since it's my primary work. SW progress is slower than HW progress, but it's still progress, which is all we need.

Though, that's not what i see happening. It's the opposite: Epic says >50% of next gen games will use UE5, which has an awesome solution to the geometry LOD problem, but negates the benefit with nice but costly brute force lighting. So min specs are high, forcing people to upgrade. On the other side i see NV seriously proposing Path Tracing to be the one and only solution and next big thing, which needs even more HW power, forcing people to upgrade as well.

Now you might say it always was like this. People always had to upgrade after some time. It's how our business works.

But then i argue this was fine when Moores Law was still alive, and upgrades did always cost about the same amount of money. That's over now. 4080 costs twice as much as a 3080 did, here in Europe at least. Too costly for the majority.
At this point i do not even need to discuss what graphics cards are doing technically. Looking at the economic aspect alone is enough to criticize this industry, and questioning their desperate desire on better gfx no matter what's the cost, while still failing to come up with efficient solutions.

That said, apologizes for my constant sarcasm and tech depression. I can see that's annoying, but i can't help it.
I claim to have a solution for the lighting problem, so i'll try to prove that asap, maybe it helps. I hope this industry is still there till then… <:/

Juliean

7,344

December 16, 2022 10:29 PM

JoeJ said:
But the topic was lists as a generic data structure, not C++ STL implementation of it as a container. At least i did not notice std::list was meant.

Well, C# List definately doesn't qualify as what we are talking about as “list”, because its pretty much just a vector :D I'm not sure how many very different implementations of list actually exists, but I do think that in order to achieve the “desirable” properties of list (and I put that under as many quotes as you can imagine), you'd have to use something similar to std::list. In order to delete in O(1), every node has to be its own allocation, discounting building a list in one go with an allocator, because you know, then the whole removing things from the list thing is not really a thing anymore.

JoeJ said:
And then you stick at that. Using brute force all the time. You even became expert in making brute force as fast as possible with low level optimizations, and you give talks at Siggraph about your latest and greatest brute force pixel shader.

Thats not really true at all. The reason why “brute force” works well is because it supports how modern (CPU) hardware is built. There is nothing to do to make linear algorithms more efficient. Simply:

std::find_if(vector, [identity](Element& element) { return element.IsWhatIWant(identity)));

is way more efficient than

std::binary_search(vector, [](Element& element) { return element.IsWhatIWant(identity)});

(pseudocode, obviously) Until you get into the thousands of thousands of elements. What am I exactly hardcore optimizing, by selecting the proper algorithm for the problem…?

Thats by the way all I'm advocating for. Look at the problem. I had a boss once that had this mindset, O-notation always wins. Have 1 element? Use O(nlogn) algorithm, because you now someday you could have 20000000 elements - even if the possible set of elements was “letters from A to Z”. And no, I'm not really exagerating, maybe a tiny bit.

I also agree with you - if you have a problem that has the number of elements where it pays of, by all means use the more complicated, more O-notation friendly (sry for a lack of a better word) approach. But if you do have that case, we shouldn't really be having this discussion, should we? Because I did already say:

Obviously, I think that for systems with many millions of elements like voxels you'll need those structures

So if you are working on a problem which requires O(log n) algorithms/data-structures, nothing that I said really applies to your situation. I do also optimize things if requirements come up - for example, my script-backend started up as a clusterfuck where every operation had a tremedous overhead. As I got more and more scripts, and introduced the aformentioned turbo-mode, I needed to optimize it tremendously, in a fashion that could be described similar to how you would optimize brute-force collision detection to using a more optimized structure - maybe even more hardcore, to be fair, going from “code is classes” to “bytecode” to “JIT” was a ginormous endevours.

But I'd digress. I don't think we really actually disagree on so many points as we maybe misunderstood each other? I do think you should find it agreable that you should take decisions between “simple approach” and “complex, better O-complexity approach" based on what dataset you are working with, wouldn't you? Maybe after all we disagree on what a list is, or how bad (linked) lists really are IMHO :D

JoeJ

4,185

December 16, 2022 11:59 PM

Juliean said:
I'm not sure how many very different implementations of list actually exists, but I do think that in order to achieve the “desirable” properties of list (and I put that under as many quotes as you can imagine), you'd have to use something similar to std::list. In order to delete in O(1), every node has to be its own allocation

Now i did look up definition on wikipedia:

In computer science, a linked list is a linear collection of data elements whose order is not given by their physical placement in memory. Instead, each element points to the next. It is a data structure consisting of a collection of nodes which together represent a sequence. In its most basic form, each node contains: data, and a reference (in other words, a link) to the next node in the sequence. This structure allows for efficient insertion or removal of elements from any position in the sequence during iteration. More complex variants add additional links, allowing more efficient insertion or removal of nodes at arbitrary positions. A drawback of linked lists is that access time is linear (and difficult to pipeline). Faster access, such as random access, is not feasible. Arrays have better cache locality compared to linked lists.

I agree which this definition. It does not mention allocation. Further there is even this:

The principal benefit of a linked list over a conventional array is that the list elements can be easily inserted or removed without reallocation …

So reallocation is not needed to use the concept of a list.

But because the concept is so simple, there is no point to add such list data structure to stl. While having a container which works like a list does make sense.
Thus the misunderstanding, but i'm not wrong with saying lists and containers are different concepts.

I make a practical example of a ‘good' list:

We have a regular grid for collision detection and many objects, each smaller than a grid cell.
One object thus may cover 4 grid cells at most. If we insert the object into each of them, each cell may require a std::vector to store a unbound number of objects, which sucks.

Instead, we put each object only into one grid cell where the difference between centers is the smallest. To find all collisions for one object, we now need to iterate it's current cell, but also 3 neighbor cells. Same thing we would do for a quad tree. That's much better, because we no longer need a std::vector or something similar.

To put multiple objects into a single cell, we use a list.
Each object has a next pointer, and each cell has one pointer to the first object of the list.

To build the grid acceleration structure, we rebuild it each frame in O(n): For each object, we link its next pointer to the former object in the cell, and we link the cell pointer to the current object.
To iterate objects of a cell, we start at the cell pointer and follow the object pointers.

That's the most trivial example of using a list i can currently think of. It's good, and there is no allocation going on. Number of objects is constant, number of grid cells too.
Do you agree there's nothing wrong here, and that this is an example of using a list?
If so, that's why i think you should not say ‘lists suck in general’. And for similar reasons you shouldn't say octrees are not worth it. It always depends. But Newcomers pick it up, and if they hear it a thousand times, they won't question it. They will never try to use an octree. And then… 2000$ GPUs… the downfall of gaming… game over. ; )

Juliean said:
(pseudocode, obviously) Until you get into the thousands of thousands of elements.

Which is exactly what i said, with the addition that we often deal with millions of elements in games now.
I think it's like that:

Good advice in 2000: Stop! Did you even try to just iterate? It's way faster for your little number of 100 objects. So scrap that complex binary search code! It's because modern CPUs… blah… cache lines… blah…

Good advise in 2020: Um… uh… I'm sorry, dude. But for your 100000 objects, maybe using a binary search wasn't such a bad idea. Let's dig out that old CS book to look up how it worked…

Juliean said:
I'm not really exagerating, maybe a tiny bit.

That goes both ways.
Ofc. i do not talk about simple problems like searching a letter in a string. I talk about open problems still in front of us. Those problems have no simple solution, otherwise we would have found it already.
Thus i'm convinced complexity is the most promising path to achieve further progress. Times where doubling fill rate enabled us to do new things are over.

Take Nanite for example. It's complex, uses trees, emulates HW accelerated features in software. That's all the things people told me to be pointless back then. (I'm happy i did not believe them.)
And see what it gives: New visuals and progress, much more impressive than halfing framerate with RT reflections, imo.

That's the spirit, and what i personally expect from game developers. It's about tackling open problems and solving them, which won't be as simple as the things we already did before, but keeps our medium evolving.

Juliean said:
we shouldn't really be having this discussion, should we?

Ofc. We both know the other already knows which tools to use for what. We won't teach us anything, nor do we argue just to have fun.

Tbh, i think the reason we argue still is that we all know that we're fucked, and the gaming ship goes downhill. But maybe (or likely) that's just me.

Juliean said:
I don't think we really actually disagree on so many points as we maybe misunderstood each other?

I never thought we would disagree at all.

It's my pessimism, and i always had problems with people not spotting my sarcasm, which is my way to deal with the pessimism by turning it into some fun. My fault. Sorry for that, guys.

I have some optimism too. But i lack the proper middle ground i guess.

Juliean

7,344

December 17, 2022 12:47 AM

JoeJ said:
Which is exactly what i said, with the addition that we often deal with millions of elements in games now.

But not in all games though. So it still depends on what game you are working on. I would not say that may 14h story-based 2D RPG is an extermely simplistic game for example (you'll just have to take my word for that of curse), and yet as I think I have made a good point of, applying a lot of those “optimizations” that make sense for games with larger object-counts wouldn't. I guess thats the question then - when and how do you decide what you do? If you do work on a game with max 100 collidable elements at the same time and you treat it as if you had 1mio elements, than you are wasting your time. And if you treat a game with 1 mio elements as if you had 100, than it probably runs as fast as Unity does. Of course you have to account for things that you can very likely predict, but I don't think you made a mistake if you design your systems for a certain requirement, and then the requirement changes. Then you simply have to refactor and apply a more appropriate solution. But doing that beforehand would probably be the posterchild for premature optimization, in my book ?

JoeJ said:
That's the spirit, and what i personally expect from game developers. It's about tackling open problems and solving them, which won't be as simple as the things we already did before, but keeps our medium evolving.

Which I totally agree with. And I do that, too. As I already pointed out, I do optimize things as they pop up, all the time. For example, the (debug) loadtime of my project become too long for my liking, ~8s, so I started implementing optimizations to bring it down to something more acceptable. But again, thats strictly based on a measured performance-metric. Which is what I belive you should use as a basic every time when you optimize anything.

JoeJ said:
That's the most trivial example of using a list i can currently think of. It's good, and there is no allocation going on. Number of objects is constant, number of grid cells too. Do you agree there's nothing wrong here, and that this is an example of using a list?

I cannot really agree that this is an unanimously good use - sorry to say. But it does sound like its a list, at least :D
It does have exactly the same issue that normal lists have, namely, that iteration-overhead is going to be pretty bad. Now, it will outperform O(n^2) brute-force solutions at a reasonable number of elements, which I'm just going to assume you have. Though I'm still not convinced that its the best possible solution.

I'm a bit tired, so I'm just going to throw some findings that I have about the problem you describe:

I really don't find separate std::vectors per cell to be worse than using lists. The iteration-speed is definately going to be faster. And allocation during frames could be kept pretty low by just using the internal capacity-mechanism of the vector, if the cells themselfs stay constant. If they don't, you could still use move-semantics to perserve the vectors themselfs when the whole nodes are destroyed, and unpool them. I use something similar for the coroutine-stacks that need to be maintained for my script-backend, after the initial load has been reached, no allocations have to be made. And you would have to allocate the list-nodes once anyway, so doesn't seem like its much worse
If thats really out of the question for a reason that my sleepy brain cannot understand, then at least the fact that you rebuild the entire tree each frame screams to me that you would be able to compact the “list” or “vectors” from each node into one linear area of memory, and in the tree-nodes itself only store a pointer into that area and the count. Then iteration is linear again, and everything is fine. You might need to move some elements around and find things, because you do not know the size of how many elements are in each cell, which could perform overall worse for very very large Ns - though this also depends again on how often you iterate over the final structure vs how often you build it.
Ok, you have that very large N of objects, so you really need that O(1) speed for building the list. If you are not highly memory constraint, then you could simply allocate a large area of memory and give each node a potential area of elements, based on some heuristics of the amount of elements an average cell might take (or, if you are really not memory constraint, how much the maximum a cell could have). Then filling in cell-elements is still the same complexity as it was for the list, but you get the advantages of linear memory. Yes, you'll have holes in the memory, but thats not the main issue. As long as all elements inside a cell are next to each other, and the code for iterating them is a simple loop and not a jump between nodes, its going to be better.

Maybe nothing of this can really be done in your example for one reason or another, but you did want programmers to tackel open problems and solve them, sooo ? And just to reiterate, the main problem that I personally see is that with every linked-list based solution, you are paying a large cost for iteration. So if there is any way to keep the same time-complexity, even if amoritzed, but use continous memory, it is very likely to be faster - but would need to be measured of course too ?

JoeJ said:
Take Nanite for example. It's complex, uses trees, emulates HW accelerated features in software. That's all the things people told me to be pointless back then. (I'm happy i did not believe them.) And see what it gives: New visuals and progress, much more impressive than halfing framerate with RT reflections, imo.

Those things might have been pointless unless something like nanite came up. You don't really need a steering wheel without a car, know what I'm saying? Nanite from my limited understand is a very complex system that does achieve some impressive things, and might not be possible without the features that you describe. But its also probably more than the sum of the parts, and not a reason to now use trees and stuff everywhere when you don't need to ?

JoeJ said:
Tbh, i think the reason we argue still is that we all know that we're fucked, and the gaming ship goes downhill. But maybe (or likely) that's just me.

I don't know about that. Personally, I have fun in my “writing own engine and game that I don't own the IP to” land :D I do find many things about the larger gaming landescape disturbing too, like how a shitpile like Unity got so much traction, and how horribly (and IMHO unnecessarily) bad performance-wise Blueprints in an Engine I otherwise enjoy (Unreal) ist. But I don't see things as pessimistic, I guess.

JoeJ said:
Ofc. We both know the other already knows which tools to use for what. We won't teach us anything, nor do we argue just to have fun.

Well, somebody I knew once said that you don't have to really talk about things that you already agree on (or discuss; not sure if the saying translates that well into english). So I do find “arguing” like that health and vital, to a certain extent ?

Ivica Kolic

104

Author

December 17, 2022 08:29 AM

Juliean said:
JoeJ said:
Take Nanite for example. It's complex, uses trees, emulates HW accelerated features in software. That's all the things people told me to be pointless back then. (I'm happy i did not believe them.) And see what it gives: New visuals and progress, much more impressive than halfing framerate with RT reflections, imo.
I don't know about that. Personally, I have fun in my “writing own engine and game that I don't own the IP to” land :D I do find many things about the larger gaming landescape disturbing too, like how a shitpile like Unity got so much traction, and how horribly (and IMHO unnecessarily) bad performance-wise Blueprints in an Engine I otherwise enjoy (Unreal) ist. But I don't see things as pessimistic, I guess.

Guys, large scale complex environment AI generation is coming soon. We are all screwed including UE and Unity. You'll be able to generate fully custom GTA6 world in the matter of days and it will fit in less than a megabyte which is ridiculous since GTA6 is a 2 billion dolar game.

Blender and UE just introduced procedural modelling and it is already somehow obsolete. Nanite and chunk map loading that UE 5 introduced will also be of less use since API takes care of generating needed geometry on the fly. Lumen can't work well with dynamically created geometry - you should split it in small chunks for which you must generate SDF which takes too long so it won't be practical. If you have HW raytracing Lumen is OK but that won't work on laptops with integrated graphics, smartphones or AR glasses. So pretty much everything cool that UE5 introduced is not compatible with what is coming which is a shame because UE5 is a really cool tech.

This is why we must make simple GI solutions that work with laptops with integrated graphics. Lumen is just nvidia's VXGI with SDF ray tracing instead of voxel cone tracing but I expect that will be returning soon (nvidia deprecated it because they wanted to sell raytracing hardware). We must also make physics engines compatible with dynamically generated geometry, and all the support APIs like that. That is all we can do to prepare. I'm sure UE and Unity will do the same - they are big companies with a lot of smart engineers but the landscape is changing rapidly.

a light breeze

358

December 17, 2022 10:14 AM

My current approach to object storage is to use a std::deque, and to mark objects as dead (and reusing the storage on the next allocation) instead of actually deleting them from the container. Advantages:

Object indices are stable. If a given object is at index 5, it will always be at index 5, even if it is the only object remaining in the container.
Object pointers are stable. Given a pointer to an object, it will point to that object for that object's lifetime.
Resizing the container does not copy/move any objects.
Objects are not required to support copy/move operations.
Cache performance is similar to std::vector.

Disadvantages:

Iteration is more complex and therefore slower, since I need to skip over dead objects.
Lookup/iteration is more complex and therefore slower due to the extra level of indirection in std::deque over std::vector.

JoeJ

4,185

December 17, 2022 10:38 AM

Juliean said:
I cannot really agree that this is an unanimously good use - sorry to say. But it does sound like its a list, at least :D It does have exactly the same issue that normal lists have, namely, that iteration-overhead is going to be pretty bad. Now, it will outperform O(n^2) brute-force solutions at a reasonable number of elements, which I'm just going to assume you have. Though I'm still not convinced that its the best possible solution.

I didn't say it's an optimal solution, just a trivial example of putting a list to gut use.

But i wouldn't say iteration overhead is an issue. Complexity on iteration is minimal. Following a pointer to the next object is O(1) just as much as increasing your index when iteration an array sequentially is O(1).

The random memory access isn't an issue either, because you will only iterate a small number of objects in practice.
If you go brute force, testing each vs. each other object in O(n^2), random access would be an issue ofc., but that's no more argument, now that we have an acceleration structure to prevent large iteration counts.

So you have to agree the example demonstrates good use. You have to, because i did not specify how large N is. Thus, big O notation applies.
In questions and discussions brought up here, N mostly isn't specified. In those cases the general advise should be towards a solution of better time complexity, not a solution of maximized simplicity, imo.
To propose a brute force solution, we have to build up additional context introducing such specification of N. Which you and others do, but often only after guy like me complains.
So i observe a culture of brute force mentality in game development. And it becomes worse, now that most functions requiring some serious engineering are abstracted away in U engines or libraries.
This goes so far that game developers degrade to content creators. And i don't think this does our medium a favor, where pushing tech always was a dominating factor and road to success.
If we loose this factor of tech progress, we will see a stagnation in games, including game design. One game will be like the other. And we will see the same downfall which happened to the music industry: Out of ideas, selling out by streaming services, game over.

We depend on this tech progress. We must keep it upright, no matter what's the cost. But we can't move the cost to our customers, letting them pay for unaffordable HW. It won't work.
So we have to move the progress from the HW side to the SW side. It's the only way i personally see.

You see my motivation to bring those things up again and again has nothing to do with programming topics. I try to discuss the growing disappointment about games, and what we could do about it.
But for reasons i do not really understand, rarely somebody responds. It feels like the industry is ignoring the issue. So i become more aggressive and insist, but well - it doesn't help. Nobody is willing to talk about it.
Watching game awards but ignoring the iceberg in front…

However, please notice i do not criticize your opinion or work. You make a 2D RPG which does not need pushing tech. Brute force collision detection is fast enough, you do not even need multi threading.
That's fine. I'm not disappointed because you don't offer photorealism, massive destruction, or insane details, etc. I don't think any game should have this, just some of them.
Impressive tech is not what makes a good game. I only think we need this constant progress so the medium doesn't get stuck on the long run.
But currently, in a time of global crisis, keeping min specs low should be the priority i think. Your 2D RPG feels right, Portal RTX feels wrong.

Juliean said:
I really don't find separate std::vectors per cell to be worse than using lists.

That's interesting. I'e tried std::list just once. I saw it's very slow, and never considered it again.
But vector<vector<>> is even worse. Much worse. Just, it's hard to avoid it in cases, so i have to use it sometimes for my tools. But it's a real performance killer. Often i got better results with using some map.

Juliean said:
the fact that you rebuild the entire tree each frame screams to me that you would be able to compact the “list” or “vectors” from each node into one linear area of memory

Yeah, that's towards a ‘optimal solution’, as you said before. But that's much too complicated for a minimal example about using a list efficiently.
Btw, the real reason of complexity here is multi threading. I have such efficient sparse grid for my fluid simulator, for example. Updating the grid, so most processing remains lock free, becomes the major challenge.
That's also the reason i can not create reusable code for such things. I have building blocks such as pools for sparse blocks, traversal of the sparse grid, etc. (I even used inheritance for those things, and templates! \:D/ You would like it.)
But it's just a little bit of code. Trivial stuff, not really reducing the work if i'll reuse the sparse grid for other things in the future.
Updating the grid is currently too specific to the fluid simulator to make it general functionality. It depends on assumptions, e.g. that many particles are in a cell, and that only a small number of them moves to another sparse block in one step.

Maybe, if i could get 200 years old, i would become good enough to turn such things into a good, reusable code base. Or if i would spend years on just such specific problems.
But currently i'm faster by doing things from scratch and specific to current needs. I feel incompetent, but it works. : )

Coming back to the grid, it's interesting that the final reordering of particles in memory so they are sequentially ordered to a grid cell, takes the same time than all the complex stuff (updating octree for the sparse blocks, binning particles to blocks and cells, etc.) The reordering is just a simple copy operation per particle. It's bandwidth limited, thus multi threading barely helps to make it faster.

The interesting observation here is this: For a real time fluid sim, you would not use a sparse grid. You would use a simple regular grid, causing the restriction of ‘Yes, you can do realtime 3D fluid sim now! But only in a small box. So you need to design around this limitation.’ That's the current state of the art. Regular grid avoids all the complexity and cost of a sparse grid.
But the reordering is needed no matter what, otherwise your performance will drastically decrease over time.
Thus, if the sparse update only has the same cost as the reordering step, we could actually consider to use it for realtime too, lifting the box limitation, and enabling large scale fluid simulations.
So that's an area where i expect some progress happening in the next years. Which is then an example where replacing brute force with complexity will give us new things.

Juliean said:
So if there is any way to keep the same time-complexity, even if amoritzed, but use continous memory, it is very likely to be faster

That's what i meant when talking about ‘tuning for granularity’ (lacking a better word).
Basically, to beat brute force with complex solutions in realtime, we want to generate small batches of brute force work on the lowest level. So we still have advantages like continuous memory, low divergence, etc., but we also manage to reduce the work.
That's a general pattern i see. Better time complexity is still the highest goal, but we need to respect how modern HW works on the low level. HW wants brute force.
So we seek for the best compromise, which works, but ofc. N still needs to be large enough to be worth it.

Juliean said:
Nanite from my limited understand is a very complex system that does achieve some impressive things, and might not be possible without the features that you describe. But its also probably more than the sum of the parts, and not a reason to now use trees and stuff everywhere when you don't need to

But you don't know that you might need Nanite before you have it. Only after you took the risk to invest serious effort to solve an open problem, and then it actually works after a path of failures, only after that you see what it gives.

So what i miss from the industry is the will to invest into risky research.
Maybe it's indeed a good thing Epic became such a tech behemoth, so they can do such investments. Not sure.

Regarding trees, that's just a main subject for me personally. There are sure other promising things worth to explore. But trees rock, and i love them. So you can't talk down on precious trees without expecting my intervention. :D

Juliean said:
I do find many things about the larger gaming landescape disturbing too, like how a shitpile like Unity got so much traction, and how horribly (and IMHO unnecessarily) bad performance-wise Blueprints in an Engine I otherwise enjoy (Unreal) ist. But I don't see things as pessimistic, I guess.

Ahhh… thanks. : ) I feel better now.

What kind of coding standard do you prefer

This topic is closed to new replies.

Popular Topics

Recommended Tutorials

What kind of coding standard do you prefer

This topic is closed to new replies.

Popular Topics

Recommended Tutorials

Reticulating splines