Logic World Wednesdays: The GPU-Instanced Edition

by @MouseHatGames (Developer), 1 year ago

GPU Instancing - Jimmy

As you have all demonstrated in the months since the game’s release, Logic World players are eager to build as big as they possibly can. I want to make “as big as they possibly can” as big as I possibly can, and so I’ve been working on some fancy new code for rendering large quantities of objects performantly.

In Logic World 0.90, objects are combined into a single big mesh which is then rendered. This is a lot faster than rendering many smaller meshes, but it also has some disadvantages: a lot of RAM is used to store the single big mesh, and a lot of CPU time is spent creating and updating the single big mesh.

My new technique, GPU instancing, has neither of those disadvantages, and is also significantly faster. It’s also way, way harder and way, way more complicated. Until very recently, I was not skilled enough as a game developer to pull it off.

The basic idea is that you tell the GPU about a single object – say, a cube – and then you ask it to render many separate instances of that object with different properties (such as different positions in 3D space, or different colors). The trick is that you can render millions of instances as a single request to the GPU, which is much faster than millions of individual requests. Furthermore, information shared between instances (such as the mesh) only needs to be sent to the GPU once, which speeds things up a lot.
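To make the difference concrete, here is a minimal sketch in Python rather than real graphics code. `FakeGpu`, `Instance` and `CUBE_VERTICES` are hypothetical names invented for this illustration; the fake GPU only counts how many requests it receives and how many vertices are sent with them, and is not how Logic World or Unity actually implement instancing.

```python
from dataclasses import dataclass

# The shared mesh: a unit cube described once by its 8 corner positions.
CUBE_VERTICES = [(x, y, z) for x in (0.0, 1.0) for y in (0.0, 1.0) for z in (0.0, 1.0)]

@dataclass
class Instance:
    position: tuple  # (x, y, z) offset in world space
    color: tuple     # (r, g, b)

class FakeGpu:
    """Hypothetical stand-in for a GPU that just counts what it is sent."""

    def __init__(self):
        self.requests = 0
        self.vertices_uploaded = 0

    def draw(self, vertices, instances):
        # One request: the mesh is sent once and reused for every instance.
        self.requests += 1
        self.vertices_uploaded += len(vertices)
        # Place the shared mesh at each instance's position.
        return [
            [(vx + ix, vy + iy, vz + iz) for (vx, vy, vz) in vertices]
            for (ix, iy, iz) in (inst.position for inst in instances)
        ]

instances = [Instance((float(i), 0.0, 0.0), (1.0, 0.0, 0.0)) for i in range(1000)]

# Individual requests: one draw per object, mesh sent every time.
naive = FakeGpu()
for inst in instances:
    naive.draw(CUBE_VERTICES, [inst])

# Instanced: a single draw covering all objects, mesh sent once.
instanced = FakeGpu()
placed = instanced.draw(CUBE_VERTICES, instances)

print(naive.requests, naive.vertices_uploaded)          # 1000 8000
print(instanced.requests, instanced.vertices_uploaded)  # 1 8
```

Both paths place the same 1000 cubes; the instanced path just does it in one request with one copy of the mesh.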

If you’re curious for more details on how GPU instancing works, you can check out the Unity documentation.

I’ve learned a lot these past few weeks, particularly about how to write HLSL shaders. It took a while to wrap my head around GPU instancing, but I think I’ve finally done it! As programming so often is, it was very frustrating but also very fun.

Here’s a demo video, showing off my new instancing shader and how its performance compares to the previous method.

As you can see, on my system at 3840x2160 the shader can presently do one million cuboids with shadows or four million cuboids without shadows before it dips below 60 FPS. These numbers aren’t a hard limit on how big you can build, though: with a limited render distance, you might have a world with 40 million objects, but if only 10% of it is visible at once you’ll still have a high framerate.
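The arithmetic behind that example can be checked quickly. This sketch only uses the figures quoted above (which apply to the author's system) and assumes that rendering cost depends only on the objects inside the render distance; the variable names are made up for illustration.

```python
TOTAL_OBJECTS = 40_000_000     # example world size from the post
VISIBLE_PERCENT = 10           # share of the world within render distance
BUDGET_NO_SHADOWS = 4_000_000  # cuboids before dipping below 60 FPS, shadows off
BUDGET_SHADOWS = 1_000_000     # cuboids before dipping below 60 FPS, shadows on

visible = TOTAL_OBJECTS * VISIBLE_PERCENT // 100

print(visible)                       # 4000000
print(visible <= BUDGET_NO_SHADOWS)  # True
print(visible <= BUDGET_SHADOWS)     # False
```

So the 40-million-object example lands right at the quoted no-shadows figure.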

I’m very excited about GPU instancing. Update 0.91 is going to be a massive performance upgrade for large worlds.

@trucksarenoisy, 1 year ago

i wonder how well my garbage gpu will like instancing (it’s a nvidia gt 730)

@Jimmy (Developer), 1 year ago

Regardless of your hardware, the new rendering tech will be faster than the current rendering tech. Everyone will get a performance boost out of this.

@ac101m, 1 year ago (edited 1 year ago)

Very cool! Stopped working on my project because of capacity problems, this should tide me over for the time being ;)

Though I do still think that a mechanism to cover up components on a board would ultimately go a lot further!

@deltabooq, 1 year ago

Wow. The dedication you put into the game and the skills you learn from it are clearly impressive. I’m so happy the game exists :D

@Jimmy (Developer), 1 year ago

BTW, everyone – there probably will not be a LWW next week, as I have a very busy schedule this week outside of gamedev.

@Ecconia, 1 year ago

Verrrry nice!
I’d love to see a near future where we can render something like a million components and wires at the same time with ease!

I do have a technical question: how were you updating the colors of cubes before this update, and how will you do it afterwards?
(Like we have Display Music components and Conductors).

In the API we are literally changing the color of Display meshes. So I assume that means updating the color of just one instance after this update, and updating the whole mesh before it.
How does updating the color of an instance work? Is it just changing the data on the GPU (if instance data is stored there, as I assume), with something like a ComputeShader? Do conductors get special treatment, or do they use the same technique?
(Back in the day, there was the idea of supplying an array of states and then storing only an index into that array to generate the black/red color.)

@Jimmy (Developer), 1 year ago

Great question. In 0.90, all geometry is colored using the vertex color of the combined mesh. So, if we want to change the color of a Display, we check and see that this display uses vertices 2469 through 2493 in the combined mesh. We then modify the combined mesh with an updated vertex color array, where all those vertices have been set to the new color.
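As a rough illustration of that 0.90 approach (not the game's actual code), the sketch below recolors a vertex range inside one big color array. The function name, the 10,000-vertex mesh size and the choice to copy the whole array are assumptions made for the example; only the vertex range 2469 through 2493 comes from the explanation above.

```python
def set_vertex_range_color(vertex_colors, first, last, color):
    """Return a new color array with vertices first..last (inclusive) recolored."""
    updated = list(vertex_colors)  # assumption: the whole array is rebuilt
    for i in range(first, last + 1):
        updated[i] = color
    return updated

WHITE = (1.0, 1.0, 1.0)
RED = (1.0, 0.0, 0.0)

# Hypothetical combined mesh with 10,000 vertices, all white to start with.
combined_colors = [WHITE] * 10_000

# The Display in the example occupies vertices 2469 through 2493.
combined_colors = set_vertex_range_color(combined_colors, 2469, 2493, RED)

print(combined_colors.count(RED))  # 25
```

Even though only 25 vertices change, the update touches an array sized for the entire combined mesh.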

The method used in 0.91+ with GPU instancing is much more efficient (and sane). Each rendered instance contains a piece of “Color” data, which is used by the instance shader.
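A minimal sketch of the per-instance idea, again in Python rather than shader code: each instance carries its own color, and a stand-in for the shader simply reads it. `InstanceData`, `shade` and the index 42 are hypothetical names and values chosen for illustration, and how the game actually stores or uploads instance data is not described here.

```python
from dataclasses import dataclass

BLACK = (0.0, 0.0, 0.0)
RED = (1.0, 0.0, 0.0)

@dataclass
class InstanceData:
    position: tuple  # (x, y, z)
    color: tuple     # the per-instance "Color" data

def shade(instance):
    # Stand-in for the instance shader: the output color is the instance's own color.
    return instance.color

instances = [InstanceData((float(i), 0.0, 0.0), BLACK) for i in range(1000)]

# Recoloring one Display means writing one instance's color; nothing else changes.
display_index = 42  # hypothetical index of the Display's instance
instances[display_index].color = RED

print(shade(instances[display_index]))                    # (1.0, 0.0, 0.0)
print(sum(1 for inst in instances if inst.color == RED))  # 1
```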