Procedural Landscape - Efficient Runtime Generation


I am a hobbyist developer working on a procedurally generated landscape. I want to share my general approach to see if it makes sense, and to get ideas on different ways to approach it.

My current code does the following:

I store heights at a 50m sample spacing in a set of height files, then at runtime load them up, interpolate, and inject some noise over the top to create a more naturalistic look. This gives me a detailed heightfield at any specific tile size.
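
Roughly, the refinement step looks like this (a simplified sketch, not my exact code; Noise2D and the array layout stand in for my actual noise and file format):

    // Simplified sketch: bilinearly interpolate the 50m source samples,
    // then add a few octaves of noise on top.
    float Noise2D( float x, float z ); // stand-in for the injected noise function

    float SampleHeight( const float* coarse, int width, float x, float z )
    {
        const float spacing = 50.0f; // source sample spacing in metres
        float fx = x / spacing, fz = z / spacing;
        int ix = (int)fx, iz = (int)fz;
        float tx = fx - (float)ix, tz = fz - (float)iz;

        // Bilinear interpolation of the four surrounding coarse samples
        // (bounds checks omitted for brevity).
        float h00 = coarse[iz * width + ix],       h10 = coarse[iz * width + ix + 1];
        float h01 = coarse[(iz + 1) * width + ix], h11 = coarse[(iz + 1) * width + ix + 1];
        float base = (h00 * (1.0f - tx) + h10 * tx) * (1.0f - tz)
                   + (h01 * (1.0f - tx) + h11 * tx) * tz;

        // Inject fractal noise over the top for the naturalistic look;
        // the amplitude and octave count here are illustrative.
        float amplitude = 4.0f, frequency = 1.0f / spacing, detail = 0.0f;
        for ( int octave = 0; octave < 4; ++octave )
        {
            detail += amplitude * Noise2D( x * frequency, z * frequency );
            amplitude *= 0.5f;
            frequency *= 2.0f;
        }
        return base + detail;
    }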

Alongside the low-frequency height map I generate a texture coverage map at the same resolution, based on noise plus biome information, plus the heightmap values (snow line, scarp, etc.). Again, when I load this I resolve the probabilities held in the coverage map into fractal parameters, which give me a detailed coverage value at any specific tile size.

Models on the landscape are given height positions at runtime by sampling the detailed heightfield generated above. Models themselves (such as houses) can impact the heightmap, as can roads, rivers and other linear structures - so I generate a 'deformation map' overlay for both height and coverage, per tile at runtime, and apply those to the first two steps to produce the finalized height and coverage data.

My tiles are rendered from a fixed set of vertices, with the height coordinate picked up from the generated heightmap, and textured based on the coverage map.

Each frame has around 120 tiles visible, with a pretty consistent triangle-to-pixel coverage depending on the view distance. This set of visible tiles is calculated on a background thread, and any missing tiles are picked up into a background “tile construction queue”.

Each frame may do two jobs - one rendering the existing landscape, and the second (depending on the frame budget remaining) picking up the construction of tiles from the ‘tile construction queue’. I can 'afford' about 5 tile construction jobs per frame while keeping around 40fps.
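
The per-frame budgeting amounts to something like this (a sketch; TileRequest, ConstructTile and the cost estimate are illustrative placeholders):

    #include <chrono>
    #include <queue>

    struct TileRequest { /* tile coordinates, LOD level, ... */ };
    void ConstructTile( const TileRequest& ); // issues the per-tile GPU work

    // Drain the tile construction queue until the remaining frame budget
    // would be exceeded by another tile.
    void BuildTilesWithinBudget( std::queue<TileRequest>& queue, double budgetMs )
    {
        const double estimatedTileCostMs = 2.0; // illustrative per-tile cost
        double spentMs = 0.0;
        while ( !queue.empty() && spentMs + estimatedTileCostMs <= budgetMs )
        {
            auto start = std::chrono::steady_clock::now();
            ConstructTile( queue.front() );
            queue.pop();
            spentMs += std::chrono::duration<double, std::milli>(
                std::chrono::steady_clock::now() - start ).count();
        }
    }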

I don't read data back to the CPU at runtime in general, although I do need to for one specific job where I generate random numbers of foliage models on a tile - but I push that to a frame-delayed job and let multiple frames pass before reading it back to avoid a GPU stall.
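
The frame-delayed readback works along these lines (a sketch using D3D11 staging buffers; the ring size and helper names are illustrative, not my actual code):

    #include <d3d11.h>

    // Copy the GPU result into one of a small ring of staging buffers, and
    // only Map() it once enough frames have passed. Collect before issuing
    // in any frame that reuses a slot.
    static const UINT kLatencyFrames = 3;

    struct ReadbackSlot
    {
        ID3D11Buffer* staging = nullptr; // D3D11_USAGE_STAGING, CPU_ACCESS_READ
        UINT64 frameIssued = 0;
        bool pending = false;
    };
    ReadbackSlot g_slots[kLatencyFrames];

    void IssueReadback( ID3D11DeviceContext* ctx, ID3D11Buffer* gpuResult, UINT64 frame )
    {
        ReadbackSlot& slot = g_slots[frame % kLatencyFrames];
        ctx->CopyResource( slot.staging, gpuResult ); // queued on the GPU, no stall
        slot.frameIssued = frame;
        slot.pending = true;
    }

    void TryCollectReadback( ID3D11DeviceContext* ctx, UINT64 frame )
    {
        ReadbackSlot& slot = g_slots[frame % kLatencyFrames];
        if ( !slot.pending || frame - slot.frameIssued < kLatencyFrames )
            return;
        D3D11_MAPPED_SUBRESOURCE mapped;
        // DO_NOT_WAIT makes Map() fail instead of stalling if the copy
        // hasn't finished; we simply try again next frame.
        if ( SUCCEEDED( ctx->Map( slot.staging, 0, D3D11_MAP_READ,
                                  D3D11_MAP_FLAG_DO_NOT_WAIT, &mapped ) ) )
        {
            // ... spawn foliage models from mapped.pData ...
            ctx->Unmap( slot.staging, 0 );
            slot.pending = false;
        }
    }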

My inability to construct more than 5 tiles per frame means that when the viewer hops into a new area it takes many frames to reach an acceptable level of detail - until that point they are presented with low-resolution tiles. It all looks a bit janky - nowhere near smooth and not at all convincing.

Other than shunting a lot of the runtime construction of heightfields, coverage and deformation into a pre-stored set of (extremely large) data tiles, is there anything about this approach that can be improved?

Is doing procedural landscape tile-by-tile still a good approach, or do people use some other approach for this now?


You might find some value in reading my blog posts (1st and 2nd) on this topic. I've made much progress since those posts, but don't have any good screenshots. Everything in my system is 100% procedural; there is nothing stored on disk aside from textures.

The thing that has taken me the longest (8 months) to figure out is how to texture the terrain efficiently with many materials (hundreds). I use PBR materials stored in 4 compressed texture arrays (albedo, normal, height, packed AO+roughness+metal). The tile meshes are constructed with an extra 4-byte vertex attribute that specifies texture mapping information per vertex: (texture layer (i.e. material), triplanar map direction (0, 1, or 2), vertex index within triangle (0, 1, or 2), interpolation mode). Then, in the vertex shader I produce 4 interpolated values which are passed to the fragment shader:

  • barycentric coordinates, which are produced by setting the output to (1,0,0), (0,1,0), or (0,0,1) depending on the vertex index stored in the texture map info.
  • separate texture map info for each of the 3 vertices.

Vertex shader looks like this:

    if ( vertexMapInfo.a == 0 ) // First vertex in triangle
    {
        lerpBarycentric = vec3( 1.0, 0.0, 0.0 );
        lerpMapInfo0 = vertexMapInfo.xyz;
        lerpMapInfo1 = vec3( 0.0 );
        lerpMapInfo2 = vec3( 0.0 );
    }
    else if ( vertexMapInfo.a == 1 ) // Second vertex in triangle
    {
        lerpBarycentric = vec3( 0.0, 1.0, 0.0 );
        lerpMapInfo0 = vec3( 0.0 );
        lerpMapInfo1 = vertexMapInfo.xyz;
        lerpMapInfo2 = vec3( 0.0 );
    }
    else // Third vertex in triangle
    {
        lerpBarycentric = vec3( 0.0, 0.0, 1.0 );
        lerpMapInfo0 = vec3( 0.0 );
        lerpMapInfo1 = vec3( 0.0 );
        lerpMapInfo2 = vertexMapInfo.xyz;
    }

Then in the fragment shader I reconstruct the uninterpolated per-vertex data:

    // Each lerpMapInfoN was written only by vertex N, so after interpolation
    // it equals baryN * mapInfoN. Dividing by the interpolated barycentric
    // weight (guarded against division by zero) recovers the per-vertex value,
    // and floor(x + 0.5) rounds away the interpolation error. Where a weight
    // is ~0 the recovered value is meaningless, but it also gets ~0 blend weight.
    vec3 invBary = 1.0 / max( lerpBarycentric, 0.000001 );
    vec3 mapInfo0 = floor( lerpMapInfo0 * invBary.x + 0.5 );
    vec3 mapInfo1 = floor( lerpMapInfo1 * invBary.y + 0.5 );
    vec3 mapInfo2 = floor( lerpMapInfo2 * invBary.z + 0.5 );

There is also an OpenGL extension (GL_NV_fragment_shader_barycentric) to do this without such hacks, but it's not available in pre-4.6 OpenGL.

Finally, I use the per-vertex map info to sample from the texture arrays. This is done 3 times, once for each vertex of the triangle that produced the fragment; each vertex can have a separate triplanar mapping direction. I then use the barycentric coordinates to interpolate between the per-vertex texture samples. This combines the interpolation for triplanar mapping and material layers, and produces very nice results with efficiency comparable to standard per-pixel triplanar mapping. The alternatives would use too many texture samples because they interpolate twice.
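
The sampling stage looks roughly like this in the fragment shader (a simplified reconstruction, not my exact shader; worldPos and albedoArray are assumed inputs, and the real version also samples the normal/height/AO arrays):

    uniform sampler2DArray albedoArray; // one of the 4 compressed arrays

    // Sample one material layer for one of the triangle's vertices.
    // mapInfo.x = texture layer, mapInfo.y = triplanar axis (0, 1, or 2).
    vec3 sampleLayer( vec3 mapInfo, vec3 worldPos )
    {
        vec2 uv = ( mapInfo.y == 0.0 ) ? worldPos.zy :
                  ( mapInfo.y == 1.0 ) ? worldPos.xz : worldPos.xy;
        return texture( albedoArray, vec3( uv, mapInfo.x ) ).rgb;
    }

    // In main(): one sample per originating vertex, blended with the
    // reconstructed barycentric weights.
    vec3 albedo = lerpBarycentric.x * sampleLayer( mapInfo0, worldPos )
                + lerpBarycentric.y * sampleLayer( mapInfo1, worldPos )
                + lerpBarycentric.z * sampleLayer( mapInfo2, worldPos );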

I use per-vertex information so that I don't need a UV parameterization of the tile mesh. This is important because I intend to eventually use voxels + dual contouring, where UVs cannot easily be produced at runtime.

Hi Aressera,

Thanks for the detailed reply - I like your method of interpolating manually for the texture coverage. I will adapt my render loop to do this; it will cut down the number of texture samples I do for close-to-camera tiles (for distant tiles I already take a snapshot of the completed texture and use it as a simple overlay). I currently use texture arrays, sampling and combining based on weightings; the weightings in my case are calculated at tile generation time and stored in a structured buffer (of NxN elements). I can adopt your method and pre-calculate the per-vertex result instead of calculating it for every pixel. Nice; thank you.

I also went down the road of procedural generation of the height/coverage map, but found that when I wanted to apply linear features (rivers, roads) or human features (habitation), this could only be done prior to the render stage, because applying hydraulic erosion and river generation (and routing roads and placing habitations) required knowledge of more than a single tile, at whatever resolution.

I only use on-disk storage so I can calculate this kind of thing - I'd be very interested in how you integrated these kinds of calculations without any permanent storage. I see from your blog that you reference problems in applying erosion globally; the transition of an erosion “particle” from one Datum tile to another was the issue I could only solve by generating the Datum-level tiles one time and storing them. My erosion (based on the same reference paper as yours) could then freely transition across Datum tiles, as could my river systems.

I also found I had to segment my coordinate system using Datums - very similar to your reference frames - so I'm really happy to see that this is a common solution pattern.

How many tiles are rendered in your visualisation, and how deep does your quadtree go? From my largest visible tile to the finest detail is a stack of about 15 quadtree LOD levels. The finest detailed tile represents a physical space of about 50m x 50m with a mesh of 128x128 vertices. I keep the vertex count pretty static for most of the quadtree levels, but I do use fewer per-tile vertices for the very largest tiles.

All the calculation of visible and available tiles and management of the quadtree is done on a separate thread in my solution, so it doesn't directly affect the render speed. But my issue is that the construction of a tile (of whatever resolution) requires about 10 API calls, both VS/PS and CS. I have to use VS/PS where the data is intrinsically vector based - such as rivers, roads, building platforms, lake basins etc. - but I use CS whenever possible to avoid the cost of running the whole render stack. This set of 10 API calls costs about 2ms per tile - which doesn't seem expensive, but it gives me only about 5 “slots” per frame to generate new visible tiles as the user moves closer to them. If you have any techniques or measurements to share about how long your tile generation step takes, I'd be really happy to understand them.

PhillipHamlyn01 said:
I also went down the road of procedural generation of the height/coverage map, but found that when I wanted to apply linear features (rivers, roads) or human features (habitation), this could only be done prior to the render stage, because applying hydraulic erosion and river generation (and routing roads and placing habitations) required knowledge of more than a single tile, at whatever resolution.

Those are definitely a challenge. I don't have any roads or structures yet, but I do have plausible rivers which are produced by the erosion algorithm. These arise naturally through the erosion simulation, using the hierarchical algorithm discussed in the 2nd blog post. The key idea is that coherence between adjacent tiles is enforced by 50% tile overlap, by using the parent tile as a starting point, and by simulating tiles at many levels of detail (20 for an earth-sized planet). Roads and locations for structures could be implemented as part of the erosion algorithm by defining places where terrain should be flattened. The routing would depend on other factors, where the lack of a global map would make it difficult. We'll see how it turns out.

PhillipHamlyn01 said:
How many tiles are rendered in your visualisation, and how deep does your quadtree go?

At surface level on an earth-sized planet, with a screen-space triangle size of about 6 pixels, I have about 2000 tiles in memory, and about 300-600 of those are rendered in any given frame. The tree goes around 21 levels deep, with the smallest tiles around 2-4 meters (triangle size of <0.1m). I use tiles of 33x33 triangles (34x34 vertices), which corresponds to a 32x32 tile with 1 row of overlap shared with the 2 adjacent tiles to cover seams. The erosion is calculated on a bigger tile that includes 50% overlap (14 px) with the adjacent tiles. Erosion tiles are 32+2*14+1 = 61x61 cells in size.

I don't actually use an explicit quadtree; I store tiles in a hash map accessed by tile ID, where the tile ID is (cube face, tree depth, face X, face Y). This allows me to quickly get the parents or children of a tile, whose tile IDs can be easily determined from its own ID.
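
In code the idea looks roughly like this (a simplified sketch, not my exact implementation):

    #include <cstdint>
    #include <functional>

    // Tile ID = (cube face, tree depth, face X, face Y). Parents and
    // children fall out of halving/doubling the face coordinates.
    struct TileID
    {
        uint8_t  face;   // which of the 6 cube faces
        uint8_t  depth;  // tree level; 0 = root tile covering the whole face
        uint32_t x, y;   // tile coordinates within the face at this depth

        TileID parent() const // (don't call on a depth-0 root tile)
        { return { face, uint8_t(depth - 1), x / 2, y / 2 }; }

        TileID child( uint32_t cx, uint32_t cy ) const // cx, cy in {0, 1}
        { return { face, uint8_t(depth + 1), x * 2 + cx, y * 2 + cy }; }

        bool operator==( const TileID& o ) const
        { return face == o.face && depth == o.depth && x == o.x && y == o.y; }
    };

    // Pack the fields into one 64-bit key so tiles can live in a hash map,
    // e.g. std::unordered_map<TileID, Tile, TileIDHash>.
    struct TileIDHash
    {
        size_t operator()( const TileID& t ) const
        {
            uint64_t k = ( uint64_t(t.face) << 56 ) | ( uint64_t(t.depth) << 48 )
                       | ( uint64_t(t.x) << 24 ) | uint64_t(t.y);
            return std::hash<uint64_t>{}( k );
        }
    };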

All my generation is done on the CPU (and is heavily SIMD optimized). The only part involving graphics APIs is where I copy the vertex data into the vertex buffers, which takes about 0.1ms per tile. Currently I do this on the main update thread, which produces significant stuttering, but there are plans to move the bulk of the generation to be async on a thread pool so that the frame rate is smoother.

The time it takes to generate a tile depends on its depth. Currently I do more erosion time steps for bigger tiles, so they take longer (up to around 20-30ms each). The smallest leaf tiles take around 3.5ms to erode. Most of the erosion time is spent managing material layers (hard to optimize). The rest is heavily optimized with SIMD and takes around 0.1ms per erosion time step for a 61x61 tile. With a multithreaded CPU and async generation this should be acceptable for flying near the planet surface at many km per second.

I took some screenshots to show current results. Disclaimer: these are nowhere near final; there is a lot of work to do to improve the generation (particularly in choosing materials in a realistic way rather than randomly). I also don't have water, atmosphere/haze rendering, or cascaded shadow maps yet; those will significantly improve the results. Edit: I also accidentally had anisotropic filtering turned off here, so it looks blurrier than it should.

Screenshot captions:

  • Rock material layering.
  • 10km aerial view; looks a little blurry due to vertex interpolation, which I plan to fix with a noise-based barycentric interpolation to hide the smooth fades.
  • Deformed and eroded rock layers.
  • Distant view with foreground rocks.

I do runtime terrain generation too. You can look at my blog for an old video, but the shading is super basic. I generate terrain on the CPU, but I do a lot of threading. I generate directly from functions, so there is no storing on disk. I'm using voxels and a marching algorithm, so I can have caves and underground areas. I'm generating whole planets with many levels of LOD, therefore my chunks need to resize; otherwise, when you zoom way out you would only have a couple of triangles per chunk.

For physics, I use a “Just In Time” system. Since everything is a series of octrees, it's easy to generate terrain just around the player and throw it out when he moves away. Basically, I pre-sweep the sphere, figure out which voxels I need to realize, and then do the normal sphere/capsule-to-mesh collision. Everything is done on the CPU, and the physics system is completely separate from the visuals in order to avoid any race conditions.
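
Roughly, it works like this (a simplified sketch; all types and helpers here are placeholders, not a real physics API):

    #include <vector>

    struct Vec3 { float x, y, z; };
    struct Sphere { Vec3 center; float radius; };
    struct AABB { Vec3 lo, hi; };
    struct VoxelChunk { bool realized = false; };
    struct Octree { std::vector<VoxelChunk*> Query( const AABB& ); };

    AABB SweepBounds( const Sphere&, const Vec3& motion ); // box over the swept path
    void RealizeCollisionMesh( VoxelChunk* );              // run the marching algo on demand
    void ResolveSphereVsMesh( Sphere&, const VoxelChunk& );

    void CollideSphere( Octree& octree, Sphere& sphere, const Vec3& motion )
    {
        // Pre-sweep: find every chunk the sphere could touch this step.
        AABB swept = SweepBounds( sphere, motion );
        for ( VoxelChunk* chunk : octree.Query( swept ) )
        {
            if ( !chunk->realized ) // realize terrain just in time
            {
                RealizeCollisionMesh( chunk );
                chunk->realized = true;
            }
            ResolveSphereVsMesh( sphere, *chunk ); // normal sphere-to-mesh collision
        }
    }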

I'm currently working on tree generation. For that I'm planning on using a series of jittered grids at different levels of detail based on the size of the particular foliage.

PhillipHamlyn01 said:
Is doing procedural landscape tile-by-tile still a good approach, or do people use some other approach for this now?

The challenge with tiling is that it's restricted to a local area, but natural terrain is the result of a global process.
If you build your terrain just from noise functions, that's not a problem, because their range is infinite. But if you want a more natural and interesting terrain involving processes such as tectonic plate movement, the resulting folds that create mountains, and erosion that creates valleys and rivers, tiles will no longer suffice to simulate such global processes.

But we need tiling because our memory is finite. A solution, then, is to split generation into various frequencies, combining results hierarchically. At the top level you have a single tile covering the entire world, but at very low resolution. That's good enough to simulate plates, for example. At the next level you have 2x2 tiles. If you add some overlap at the tile boundaries, you can simulate erosion and blend results across the boundary. Then you repeat the process for the next level of 4x4 tiles, and so on. With each new level you add more detail, similar to how an inverse Fourier transform forms a final signal by summing low frequencies with high amplitudes and high frequencies with low amplitudes.
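
As a very rough sketch of that scheme (all types and helpers here are placeholders):

    struct Grid { /* heightfield + material data for one tile */ };
    Grid UpsampleFromParent( int level, int tx, int ty ); // inherited low frequencies
    void SimulateErosion( Grid& g );                      // adds this level's band of detail
    void BlendOverlapWithNeighbours( Grid& g, int tx, int ty ); // hide seams
    void StoreTile( int level, int tx, int ty, const Grid& g );

    void GenerateLevel( int level, int maxLevel )
    {
        int tilesPerSide = 1 << level; // 1x1, 2x2, 4x4, ...
        for ( int ty = 0; ty < tilesPerSide; ++ty )
        for ( int tx = 0; tx < tilesPerSide; ++tx )
        {
            Grid g = UpsampleFromParent( level, tx, ty );
            SimulateErosion( g );
            BlendOverlapWithNeighbours( g, tx, ty );
            StoreTile( level, tx, ty, g );
        }
        if ( level < maxLevel )
            GenerateLevel( level + 1, maxLevel );
    }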

This works pretty well, but it still has a limitation: in nature, events at higher frequencies can affect lower frequencies, but we can't model that, since our method only works the other way around.
For example, we may have a problem forming fine rivers that match across tile boundaries: they might not line up exactly due to the tiled simulation, and then the blending blurs the fine details where they don't match.
To address this, we could generate a global river data structure first, and enforce it in our tiled simulations.

Spinning this thought further, we could think about an entirely different method: we generate planar rivers first, solve for height so water always flows downwards, and finally solve for mountains to be low near the rivers but high far from them.
Basically a bottom-up method. The idea has been researched and gave good results too.

So, coming back to your question, I would say there is no way around tiles or other forms of partitioning. But there are many ways to do procedural generation within those limitations.
It depends a lot on what you finally want to achieve, how you want to author your terrain, and how much performance you can spend on runtime or offline generation. That's of course difficult to plan ahead.
If your world is finite, and you don't want to spend too much time researching terrain simulation, one attractive option would be to use existing terrain tools like Gaea etc. for a coarse global map, and then only procedurally increase its detail using displacement. Those tools can export many maps, e.g. material type (rock or sediment), water flow, etc. Such maps make it easier to add detail.

@JoeJ - thanks for your insights. I started with infinite noise-generated terrain but then hit the problems you describe - the reversal where highly detailed models and structures need to affect the larger terrain (i.e. rivers, lake basins, roads and other man-made structures).

In my current implementation I pick up the coarse-level terrain and gradually refine it in the more detailed tiles by applying denser fractal and other procedural noise - but at the same time I maintain a set of geometry which overlays the natural terrain - for instance, a set of vertices describing a river course or road - and then overlay that geometry onto the natural heightmap to deform it, using Render to Texture or Render to UAV in DirectX.

The courses of the rivers, roads etc. need to be done offline, as you say; they affect multiple tiles, and so at generation time they need access to a limitless set of landscape features such as heights, slopes, etc. I experimented with trying to generate river courses at runtime, but ultimately they were limited to following the contours of the tiles they could ‘see’, leading to the algorithm being unbounded in time, which wasn't compatible with doing something fast at runtime.

Is there any model/procedure that avoids tiling? I have read of some concepts where the entire visible grid is a single tile centered on the viewer, with heights pulled in from either storage or algorithm each time a vertex snaps from one XZ location to the adjacent one. I don't know if this would provide any benefits, but it is another way of looking at the problem.

PhillipHamlyn01 said:
Is there any model/procedure that avoids tiling? I have read of some concepts where the entire visible grid is a single tile centered on the viewer, with heights pulled in from either storage or algorithm each time a vertex snaps from one XZ location to the adjacent one. I don't know if this would provide any benefits, but it is another way of looking at the problem.

It's not another way if our problem is generation. Having just the nearby world centered around the camera only helps with runtime performance (rendering, physics, etc.); it does not help with generation.

PhillipHamlyn01 said:
I experimented with trying to generate river courses at runtime, but ultimately they were limited to following the contours of the tiles they could ‘see’

Let's say you want erosion simulation. (Aressera's project shows this is possible at runtime.)

And let's say you want to perform 20 simulation steps, where each step advances sediment to adjacent cells. (Which may be a naive assumption, but that does not matter for my point.)

For this example we need to extend our tiles by an overlap of 20 cells on each side. So if our tile is 100x100, we have to simulate (100+20+20)^2 cells.
If you do this, tiles will match pretty well across the boundary, at the cost of eventually computing the same spot up to 4 times, in areas where 4 extended tiles overlap.
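
In code, that extend-then-crop idea looks like this (a sketch; Grid and the helpers are placeholders):

    struct Grid { /* heightfield + sediment data */ };
    Grid LoadRegion( int x, int y, int w, int h ); // coarse data, upsampled
    void ErosionStep( Grid& g );
    Grid Crop( const Grid& g, int x, int y, int w, int h );

    Grid SimulateTile( int tileX, int tileY )
    {
        const int kTileSize = 100, kSteps = 20, kPad = kSteps; // 1 cell of reach per step
        Grid padded = LoadRegion( tileX * kTileSize - kPad, tileY * kTileSize - kPad,
                                  kTileSize + 2 * kPad, kTileSize + 2 * kPad );
        for ( int i = 0; i < kSteps; ++i )
            ErosionStep( padded ); // influence creeps inward 1 cell per step
        return Crop( padded, kPad, kPad, kTileSize, kTileSize ); // discard the overlap
    }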

To form a river crossing many tiles, we rely on the parent tiles: being larger at less detail, they have already carved in the coarse shape of the river.

That's basically what we can do for large-scale terrains. I assume all professional terrain generation tools work that way when used for large terrains. We need the tiles to support multithreading and out-of-core processing, and there is no way around that.
But this really applies to simulations. Procedural noise or fractals don't have this problem, of course.

Regarding research, almost everything I've found involves the authors Eric Galin and/or Eric Guérin. Not much else is around, but they explored a lot of ideas.
Looking at AAA games, the standard method seems to be low-res heightmaps with procedural texturing, refined by placing models of rocks and foliage on top, often using procedural placement systems.
Even UE5 cannot really improve on this, since Nanite can't handle a massive large-scale terrain model including unique details at centimeter scales.

The only new idea seems to be using machine learning to compose landscapes from samples (e.g. textured heightmaps from satellite images).
Here you can see multiple methods compared: https://github.com/dandrino/terrain-erosion-3-ways

And this I found good for learning about erosion simulation: https://github.com/LanLou123/Webgl-Erosion

