I Optimised My Game Engine Up To 12000 FPS

Published 2024-03-17
The source code and demos are available here: patreon.com/vercidium
The greedy meshing algorithm is available here: github.com/vercidium-patreon/meshing

I spent the past 6 years creating a game engine, and I've been shocked at the things that can make or break performance.

I put together 4 simple optimisations that you can use to make your games run much quicker.

The music is Saturation, Spectrum, Shadows Azure and Chroma-quay by Disjoint Square
disjointsquare.bandcamp.com/

Timestamps
0:00 Intro
0:33 Massive Meshes
1:56 Tiny Triangles
4:51 Particle Perfection
7:34 Don't Talk So Much
9:22 Limit Breaker
10:43 The Finals
11:33 Optimise Further

#gamedev #gamedevelopment #gameengine

All Comments (21)
  • @Chizypuff
    Imagine somebody complaining on the forums about lag, then the next update they can play in 8k 200fps on their 10 year old setup. Insane.
  • @astreakaito5625
    Just unironically assume your game needs to run OK on a PS2
  • @astridwilde
    I've made a voxel engine before and literally half of these optimizations never even occurred to me. Bravo
  • @user-pt2rv6br6w
    Damn, I can't imagine the amount of work you put into visualizing all these incredible techniques.
  • @weidiocrow
    So to summarize:
    - Core principles:
      - CPU cycles are expensive and GPU cycles are cheap, so we want to reduce CPU cycles by:
        - Passing as little data as possible to the GPU, which costs CPU cycles
        - Doing as few draw calls as possible to avoid redundant CPU cycles
      - Voxels have certain properties we can leverage, mainly:
        - They exist at discrete coordinates in space
        - They're cubes, so only three sides are visible at a time
        - They have a uniform size
    - Techniques:
      - Avoid drawing geometry the player won't see:
        - Internal faces
        - Backfaces (the counterclockwise vertex strips)
        - The three (or more) faces of the chunk the player can't see
      - Batch draw calls and info:
        - Combine adjacent voxels into one mesh, forming a chunk
        - Combine adjacent faces sharing a normal to form "runs" that can be drawn as one face
        - Using instances, draw strips instead of triangles
        - Use the symmetry of a cube to flip one strip into all six positions
        - Send one indirect buffer to the GPU in one draw call instead of one per chunk
      - Reduce memory usage:
        - Using the fact that voxels use discrete coordinates, ditch floats
        - Pack location information, normal enum, length/width run info, and texture id into one 32-bit number
        - Use the fact that we're chunking, in combination with the SSBO, to further reduce the information individual voxels must hold, so voxel world info is now only meaningful together with the chunk info
    Wonderful video. The best part of this is how you took something "toyish" like voxels and showed off many optimization techniques that would be much harder to reason about otherwise.
  • @Blxz
    Not using the Todd Howard school of thought where you instead just blame gamers for not having modern high powered computers?
  • @joshuanorman2
    Having just purchased a 24,000Hz monitor, I'm very disappointed in the lack of performance
  • @sIkLeGAMING
    Insane production quality! Explaining complex problems so precisely is quite the skill. You are as good at presenting what you know as you know it. Well played brother
  • @stickguy9109
    I wish Minecraft had these optimizations. It's devouring my RAM, and anything beyond 10 chunks of render distance dips below 20 fps
  • @pseudo_goose
    I already knew about a lot of this, but you demonstrated it so well that I couldn't stop watching. And then you dropped a BOMBSHELL of one mesh per face direction! I've always wondered how you could bulk discard those faces, and now it's so obvious
  • Edit: I have unintentionally started a war in the replies and I feel guilty. My brain is melting and thinking "How did nobody think of this earlier?" at the same time. The amount of data that can be stored in a single 32-bit integer is amazing
  • @budgetarms
    A few things I want to mention: not every game uses clockwise order for culling. Just as an example, DirectX and OpenGL don't have the same winding order. And yes, basically every game uses culling to improve performance, except that for some textures (fire, leaves, windows) culling will be set to none.
  • @DevDunkStudio
    This is so impressive to see! Thank you for putting this much effort and knowledge into an understandable video. Hope to make these kinds of videos one day
  • @sentient9478
    I'm pretty sure I understood maybe 10% of what was said in this video. On the bright side I'm now 10% more educated on how to optimize game engines using voxels. Great video. :)
  • @Mad3011
    Awesome video! Actually there is a way to "break up" triangle strips. When using index buffers you can set a special "Primitive Restart Index" value that the GPU interprets as the end of a triangle strip!
  • @SatisfiedOnion
    Somehow you've outdone your previous videos... wow. This was incredibly easy to follow for a layman like me! One thing I'd be really interested in is learning about how you optimize for file sizes. Games these days are seemingly unapologetically massive. The recent Star Wars Classic Collection launch saw a (I think) 10x increase in file size without really anything added
  • @appc23
    5:28 Btw, what you were looking for there was Primitive Restart. It lets you restart a strip without using degenerate triangles.
  • @ordinarygg
    This is just insane quality of material/video/description/music/idea and result! Amazing!! Wish you huge success! And joining your patreon!
  • @tommycard4569
    Such an underrated channel. I've seen a lot of voxel rendering videos and this by far takes the cake. Clear visualization, concise explanation, and thorough coverage of the topic. Very engaging for how informative it is
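
As a rough illustration of the "pack everything into one 32-bit number" idea from @weidiocrow's summary, here is a minimal Python sketch. The bit layout, the 32x32x32 chunk size, and the names pack_face and unpack_face are assumptions made for this example; the layout the video actually uses is not stated in the text above.

```python
# Hypothetical bit layout (an assumption for illustration only):
#   bits  0-4   x inside the chunk (0..31)
#   bits  5-9   y inside the chunk (0..31)
#   bits 10-14  z inside the chunk (0..31)
#   bits 15-17  face normal as an enum (0..5)
#   bits 18-22  run width  - 1 (so widths 1..32 fit in 5 bits)
#   bits 23-27  run height - 1 (so heights 1..32 fit in 5 bits)
#   bits 28-31  texture id (0..15)

def pack_face(x, y, z, normal, width, height, texture):
    """Pack one merged voxel face into a single unsigned 32-bit integer."""
    if not all(0 <= v < 32 for v in (x, y, z)):
        raise ValueError("x, y, z must be in 0..31")
    if not 0 <= normal < 6:
        raise ValueError("normal must be in 0..5")
    if not (1 <= width <= 32 and 1 <= height <= 32):
        raise ValueError("width and height must be in 1..32")
    if not 0 <= texture < 16:
        raise ValueError("texture must be in 0..15")
    return (
        x
        | (y << 5)
        | (z << 10)
        | (normal << 15)
        | ((width - 1) << 18)
        | ((height - 1) << 23)
        | (texture << 28)
    )

def unpack_face(packed):
    """Recover (x, y, z, normal, width, height, texture) from a packed value."""
    x = packed & 0x1F
    y = (packed >> 5) & 0x1F
    z = (packed >> 10) & 0x1F
    normal = (packed >> 15) & 0x7
    width = ((packed >> 18) & 0x1F) + 1
    height = ((packed >> 23) & 0x1F) + 1
    texture = (packed >> 28) & 0xF
    return x, y, z, normal, width, height, texture

# Round trip: the packed value fits in 32 bits and unpacks to the same fields.
face = (3, 17, 31, 5, 32, 8, 15)
packed = pack_face(*face)
assert packed < 2**32
assert unpack_face(packed) == face
```

Because the position is stored relative to the chunk, the packed value only makes sense together with the chunk's own origin, which matches the summary's last point about per-chunk info.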
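
The primitive restart suggestion from @Mad3011 and @appc23 can be sketched on the CPU as follows. This is only a Python model of how a strip index buffer might be expanded into triangles, not real graphics API code; the function name strip_to_triangles and the sentinel value 0xFFFF are assumptions for the example.

```python
RESTART = 0xFFFF  # assumed sentinel index marking "end this strip, start a new one"

def strip_to_triangles(indices, restart=RESTART):
    """Expand a triangle-strip index list into individual triangles.

    A restart index ends the current strip. Triangles that repeat an index
    (degenerate triangles) have zero area and are dropped.
    """
    triangles = []
    strip = []
    # A trailing restart flushes the final strip.
    for index in list(indices) + [restart]:
        if index != restart:
            strip.append(index)
            continue
        for i in range(len(strip) - 2):
            a, b, c = strip[i], strip[i + 1], strip[i + 2]
            if i % 2 == 1:
                a, b = b, a  # swap on odd triangles so every triangle keeps the same winding
            if a != b and b != c and a != c:
                triangles.append((a, b, c))
        strip = []
    return triangles

# Two separate quads joined with a restart index (9 indices)...
with_restart = [0, 1, 2, 3, RESTART, 4, 5, 6, 7]
# ...versus joined by repeating indices to create degenerate triangles (10 indices).
with_degenerates = [0, 1, 2, 3, 3, 4, 4, 5, 6, 7]

assert strip_to_triangles(with_restart) == strip_to_triangles(with_degenerates)
```

In this small case the restart version needs one index fewer, and nothing has to process and discard the four zero-area triangles that the degenerate version produces.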