How to optimize octree loading time in JavaScript?

Started by
12 comments, last by JoeJ 1 year, 8 months ago

BH

I slightly edited the Octree class taken from the three.js fps example, and it works pretty well for simple meshes.

However I tried making a slightly more advanced house mesh with about 1000 stairs, ramps, and other components, with about 40k verticies total, and when using the same code, it crashes the web page just generating the octree.

I tried making functions to export and import the generated octree, but for even a simpler mesh when stringified it would take up about 230mb of space.

Current working implementation, with source code [with simpler mesh, but same code so same idea would apply].

How can I optimize octree loading?

It should be noted that much of the mesh is symmetric, being generated with mirror and array modifiers in blender, and the actual raw components are only very few, was thinking of only making octree for those few components, then somehow “instancing” the octree based on the mirror and array modifiers, and “transforming” each instance without reducing performance, but not sure if that's practical, or even how to go about it.

The modifiers could hypothetically be read in Javascript with jsBlend, but is there any other way to optimize octree load time for moderate/advanced meshes?

Blessings and Success

Advertisement

BH

@fleabay

Hi thanks for the reply.

How exactly? In the 3D program, or in JavaScript?

It currently consists of a few different meshes grouped together.

Blessings and Success

Awtsmoos said:
It should be noted that much of the mesh is symmetric, being generated with mirror and array modifiers in blender, and the actual raw components are only very few, was thinking of only making octree for those few components, then somehow “instancing” the octree based on the mirror and array modifiers, and “transforming” each instance without reducing performance, but not sure if that's practical, or even how to go about it.

Should be the right direction.

If you (can) have instancing, the usual approach is to build the acceleration structure only once per object. E.g. if you use the same stairs 10 times in the scene, you build one local octree for the stairs model, and then use that same octree 10 times, just like it should be for the mesh data as well.

Then, for the whole scene, you build another global octree over all those instances of your models. So instead building over 10k trinagles, you build only over 50 instances for example.

Notice: This usually requires a transform per instance to support rotation and scaling if the instances. Mirroring the stairs could be implemented as a nonuniform scaling for example.
This means your octrees are no longer aligned to the global coordinate system axis as usual.
If the application is raytracing, one option to do this is to transform the ray into the local space of the model if we traverse such transform node, where it's still axis aligned again.
If the application is range queries where the query region is an axis aligned bounding box, you could either transform the regions corners and make a new AABB from that (which causes the region to grow in size), or you could implement oriented bounding box regions to prevent such growth at the cost of complexity.

One example of this approach is current raytracing APIs for GPUs, where they call this ‘Top Level Acceleration Structure’ for the global scene octree, and ‘Bottom Level Acc.St.’ for the local model octrees.

It also is possible to support nesting of multiple levels, having models which consist of many instances themselves.

Besides saving memory and building times, this also helps with animation, as it is no longer needed to rebuild the acceleration structure. We just animate the transformation nodes instead. That's a huge win in performance and flexibility.

If we use BVH instead octree, we can even support deformations (character skinning) without a need to rebuild the tree each frame. Because BVH is not coupled to a global grid, we can just update its bounding volumes on animation, but keep the topology of the tree as is.

Notice: Once you implement support of such transformation nodes, we loose the octree guarantee of non overlapping bounds. This means we have to change traversal algorithms of a point query to now support the traversal of multiple children, as their bounds can overlap (which also is already true if we use a ‘loose’ octree).

Thus, we can say that in practice BVH often is the better acceleration structure than octree due to flexibility.

B"H

@JoeJ Hi thanks for the response.

Is there any working examples of anything like this online, regarding either BVH or instancing octree? I only have a static house mesh that I want to check collision with a capsule only. Also I want to be able to support a bigger world, is there some kind of approach to this?

I was originally thinking of splitting up the entire world into a 3D array, with each block in the array representing a certain amount of area [similar idea to an octree but different in a few ways], then when the ccharacter moves each frame, the only calculation that's done is to round the position of the movement, and check which section[s] of the 3D array it's colliding with, and perhaps make an octree within each section for the vertices in it and check them, or make nested 3D arrays within each 3D array that would contain some vertices.

But I was thinking that if this is a good idea, it has probably been done before, and the fact that I haven't been able to find anything else about it might imply it's not a good idea, what do you think about it, is it practical, are there any working examples of it etc.?

Blessings and Success

Awtsmoos said:
Is there any working examples of anything like this online, regarding either BVH or instancing octree? I only have a static house mesh that I want to check collision with a capsule only. Also I want to be able to support a bigger world, is there some kind of approach to this? I was originally thinking of splitting up the entire world into a 3D array, with each block in the array representing a certain amount of area [similar idea to an octree but different in a few ways], then when the ccharacter moves each frame, the only calculation that's done is to round the position of the movement, and check which section[s] of the 3D array it's colliding with, and perhaps make an octree within each section for the vertices in it and check them, or make nested 3D arrays within each 3D array that would contain some vertices. But I was thinking that if this is a good idea, it has probably been done before, and the fact that I haven't been able to find anything else about it might imply it's not a good idea, what do you think about it, is it practical, are there any working examples of it etc.?

The best example is physics engines.
I don't know how far we could get using JavaScript due to performance, but maybe there are web physics engines / backends / wrappers you could look up or use.

Regarding your idea of a top level uniform grid, i recently worked on just that.
My world is not primarily made from instances of (small) objects, but from a huge single static mesh. To support streaming open world, i divide the mesh into uniform grid cells.
So i can stream just the tiles around the current player position. And now i wanted to add support for this to the Newton physics engine i'm using.
The problem is that physics needs some adjacency information across the boundary of cells. In my case, Newton needs to know the face normal of the adjacent face of an edge. But this face may be in another cell, so i had to hack the Physics engine a bit to make this work by adding extra boundary face normals.

It worked, and the result is that the Physics engine builds one BVH for each cell, just as it normally does using one BVH per model.
The physics engine also uses some top level acceleration structure, with all those sub BVH models becoming leafs. Just as i had proposed above.
Because my cells form a regular grid, i might work on replacing this default top level acceleration structure with a simple regular grid for better performance in the future.

So yes, i think your ideas make total sense. But it depends.
If your world is small enough you can have all of it in memory, and you would have no instancing of models, no animations or dynamics, then there would be no reason to mix different acceleration structures.
You could just use one single octree for the whole thing. And due to reduced complexity, this simple solution should be faster than mixing regular grids with octrees.
People often think they could get some speedup by using a ‘fast’ regular grid on top, but that's not generally true in my experience.
So the better reasons to use such mixed approaches is streaming, instancing, compression, animation support, etc. Just to make that clear.

I hope you can find some physics engine to help you out, as that's a lot of work and a true performance problem.
Even if you don't need fancy physics simulations, collision detection is reason enough to use one.

B"H

@JoeJ Hi nice to hear back, that's cool that you just worked on a similar approach.

I would very much like to not use a physics engine, as:

I want to understand how everything works, and be able to repeat it and teach it

I want to keep it light weight, currently it works at full fps even on an old android model and I would like to continue that approach

The octree works fine in terms of performance, the problem is just the initial loading.

I haven't fully understood the code yet, but I'm pretty sure it rebuilds at least a large portion of it every time it adds new triangles etc., which maybe is what's slowing it down.

Is there any way to trade off RAM for faster initial computing time?

Blessings and Success

Awtsmoos said:
I want to understand how everything works, and be able to repeat it and teach it

Well, that's a good reason to do it yourself. ; )

Awtsmoos said:
Is there any way to trade off RAM for faster initial computing time?

You mean you want a smaller tree? So downloading it is faster?
Or you want faster build of the tree, because you don't download / precompute at all?

One idea would be to precompute the tree, but do not store bounding boxes with it, which take the most space.
Then calculate the bounding boxes after the download which is much faster than building the whole tree.
(Assuming you use bounding box per node at all.)

Some conventional compression library (zlib etc.) might help as well.

B"H

@JoeJ hi nice to hear back, thanks for the idea

At first I started by trying to export the whole tree as a JSON string, but that took up 230mb for even a basic model, in terms of text. I'm guessing there are other ways of exporting it using bytes to represent the triangles and sub trees to reduce space, but I haven't found any examples, I guess I can experiment with it but I don't have much experience building binary files, especially with nested arrays

I'm pretty sure one has to have split the file up into sections, with each section having header information regarding the section before and after it, or something like that. Even so it would be ideal to just load the 2mb blender file then apply the modifiers with JavaScript using jsblend then compute the octree then, but then we may be back to the earlier discussion regarding instancing multiple octrees based on array and mirror modifiers etc.

Anyways regarding loading it without the boxes would I also export all of the subtrees, but containing only triangles and other subtrees, then essentially run the "addTriangle" method for each triangle when loading? If so I can try it out but I think it may still take lots of time and may or may not crash the page, but here's the code I'm going off of https://github.com/yaakovyitzchak/reebooyaw/blob/main/Octree.js

Blessings and Success

Awtsmoos said:
Even so it would be ideal to just load the 2mb blender file

Oh, if the geometry data is that small, i do not see a need for instancing at all. To make some guess, i'd say the octree should be something like 10 or 20mb at most. Depends ofc, but 230mb seems just way too much!

Using binary files would be the first thing to try. Assuming the worst case, JSON text file uses two bytes per character to support some unicode. A floating point number may need 6 digits, which is 12 bytes vs 4 bytes with a binary file.
But even if we can reduce the size to 230/3, it still feels very large to me.

The next step would be to look at the octree data structure itself.

Awtsmoos said:
here's the code I'm going off of https://github.com/yaakovyitzchak/reebooyaw/blob/main/Octree.js

I'm not good with JS, but i make some assumptions:

The octree adds a triangle to any node it intersects. So we can assume each triangle is maybe in 4 leaf nodes on average. So we duplicate the data 4 times.
I'm not sure if the nodes just store an index to a triangle array (would be good), or if they store actual duplicates of the whole triangle data (e.g. 9 floating point numbers for the vertex points).
If you can answer it's the latter, we have the reason of the big memory requirements. And you should change this to use the former method instead.

The octree stores bounding box per node. But because this is not a loose octree (where each triangle goes into only one node, requiring overlapping bounds), this is not needed.
Instead, you can calculate the bounding box during traversal from the grid coords and tree level.
This adds some complexity to the traversal code and also makes things like raytracing less efficient, but if memory usage is your problem, i'd certainly do this. Because it reduces memory bandwidth, it could be even faster.

It's not clear to me what's the stopping criteria of the octree build process.
There are some options:
Keep subdividing the tree until no node has more than X triangles. (Which might go forever if the geometry has many triangles at a single spot.)
Or, set a maximum of N tree levels in advance, and stop subdivision after that, no matter how many triangles may end up in a single node. (Because your geometry is static, you can tehn find a good memory / performance compromise for N with trial and error.)

It's not clear to me if the octree stores triangles in internal nodes or just in leaf nodes. Explaining those options:
Storing all triangles in leaf nodes is simpler, but it only works well if all triangles are of similar size.
If you have some very large triangles in the scene, a single large triangle would go into a whole lot of leaf nodes, which costs memory and becomes inefficient. Then it's better to put it into internal nodes, because those nodes have much larger cell bounds to solve this problem.

Actually i guess you may end up implementing your own octree data structures, build process and range query / raytracing traversal logic. That's some work, so the above points are probably the main arguments to keep in mind.
It's not easy to find good tutorials on the subject, but very easy to find bad ones.

I just remember this one, which surely is good because the guy is expert and pioneer of realtime raytracing: https://jacco.ompf2.com/2022/04/13/how-to-build-a-bvh-part-1-basics/
It's about BVH, dynamic senes, and surely overkill for your needs, but may be worth reading to grab some general concepts.

This topic is closed to new replies.

Advertisement