I am starting some research on optimizations for my game engine; yes, I have reached that point. I have an Alienware R18 M2, so this will be fun. Before I go on: I am getting the machine code as small as possible to reduce cache footprint and improve locality. At the same time, I am already setting up functions / methods to conserve resources, which directs the load and minimizes resource use. It's nifty seeing high-level designs surprise you with their effect, and I'm wondering if this is a novel feature.
I am looking into parsing a syntax tree in OpenSL on my GPU using compute shaders, so GPGPU abuse. Instead of having excessive pipelines or special conditions, I would have only one global pipeline and a giant parse tree created / updated by the video card, so you operate directly on the data stored in kernel space. That space is my dynamic / static memory system, which allows me to do linking on a custom heap. The STL is replaced by high-performance in-house code; I have only in-house code for everything except Win32 / POSIX for Linux.
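To make the custom heap idea a bit more concrete, here is a minimal sketch of a bump (arena) allocator in C++. It is only an illustration of the general technique, not the engine's actual memory system; the names Arena, alloc, reset, and used are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>

// Hypothetical bump allocator: one contiguous block, allocations are carved
// off the front in order, and everything is released at once with reset().
class Arena {
public:
    explicit Arena(std::size_t capacity)
        : buffer_(new unsigned char[capacity]), capacity_(capacity), offset_(0) {}

    // Returns nullptr when the arena cannot satisfy the request.
    // `align` must be a power of two.
    void* alloc(std::size_t size,
                std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);
        const std::uintptr_t base =
            reinterpret_cast<std::uintptr_t>(buffer_.get());
        const std::uintptr_t current = base + offset_;
        // Round the current address up to the next multiple of `align`.
        const std::uintptr_t aligned =
            (current + align - 1) & ~(static_cast<std::uintptr_t>(align) - 1);
        const std::size_t padding = static_cast<std::size_t>(aligned - current);
        const std::size_t remaining = capacity_ - offset_;
        if (padding > remaining || size > remaining - padding) {
            return nullptr;  // out of space
        }
        offset_ += padding + size;
        return reinterpret_cast<void*>(aligned);
    }

    // Frees every allocation at once; no per-object bookkeeping is kept.
    void reset() { offset_ = 0; }

    // Bytes consumed so far, including alignment padding.
    std::size_t used() const { return offset_; }

private:
    std::unique_ptr<unsigned char[]> buffer_;
    std::size_t capacity_;
    std::size_t offset_;
};
```

Because consecutive allocations sit next to each other in one block, objects created together tend to share cache lines, which is the locality effect described above. A per-frame arena that is reset once per frame is one common way to use this.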
I am looking into kernel bypass if it is applicable to my project's problem, which I wish I could share. I will keep the same ideas and goals I originally had at the start. I have only scratched the surface, such as async buffer I/O with pipes / filters, and I am hoping to finish my global single pipeline (hint: it's made special), with some extra threads in the thread pool for computation or data transfer by the CPU if randomly needed.
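For anyone unfamiliar with pipes / filters, the general pattern can be sketched like this in C++. This is a rough sketch under assumptions, not the special single pipeline mentioned above: Buffer, Filter, Pipeline, add, run, and run_async are hypothetical names, and std::async stands in for a real thread pool.

```cpp
#include <cstdint>
#include <functional>
#include <future>
#include <utility>
#include <vector>

// Hypothetical types: a buffer of bytes, and a filter that transforms one.
using Buffer = std::vector<std::uint8_t>;
using Filter = std::function<Buffer(Buffer)>;

class Pipeline {
public:
    // Appends a stage; stages run in the order they were added.
    Pipeline& add(Filter f) {
        filters_.push_back(std::move(f));
        return *this;
    }

    // Pushes the buffer through every filter on the calling thread.
    Buffer run(Buffer input) const {
        for (const Filter& f : filters_) {
            input = f(std::move(input));
        }
        return input;
    }

    // Runs the whole pipeline on another thread and returns a future.
    // Assumption: std::async replaces the engine's thread pool here.
    // The Pipeline object must outlive the returned future.
    std::future<Buffer> run_async(Buffer input) const {
        return std::async(std::launch::async,
                          [this, in = std::move(input)]() mutable {
                              return run(std::move(in));
                          });
    }

private:
    std::vector<Filter> filters_;
};
```

A caller would chain stages with add(), hand a buffer to run_async(), keep doing other work, and call get() on the future when the result is needed. A real engine would more likely queue each stage as a job on its own pool rather than spawn work through std::async.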
I hope I have shared enough that I may get some novel hints in my head. I will write up a document and post it later here, in this thread. Maybe I will write two books: one on novel game engine architecture for dynamic real-time games / projects, and a second on code that is severely optimized by design, covering code efficiency, cache footprint, and seeing the results of locality.
Lastly, I have almost finished a C compiler and am studying advanced parsing: https://www.amazon.com/Parsing-Techniques-Practical-Monographs-Computer/dp/1441919015. Once that is completed, I will start on my own C++ compiler, which will only carry the build to assembler output. This means leveraging the libraries, the STL, entry text/code, and the other mess of code that will need to be completed. So I must say I will also write my own linker. I know that is rare, but it would give me massive benefits, like frame management handled by the engine and not just by the loader of the executable, passing responsibilities on to the engine and getting it to describe every part, or facet, of itself.
Does anyone have suggestions, topics to look up (I will research if I have to), comments on design, or, even more specifically, architecture advice? Do not be fooled, though: my architecture and general design are very advanced and complicated, so any help will do.