June 30, 2021

Shader Pipeline

The first shaders were very simple programs that transformed and lit vertices before the rasterization stage. They were written in assembly language. Today shaders can be found everywhere, from the user interface to physics and logic, because they are no different from any other source code. The number of shaders grows every year along with the flexibility of modern GPUs, and software delegates more and more tasks to the GPU instead of the CPU.

There are only a handful of ways to write shaders nowadays: an HLSL, GLSL, MSL, WGSL, CUDA (CU), or OpenCL (CL) dialect. Mostly the code is the same across them, with some minor differences. Because GPUs have different architectures, it’s not possible to ship a single binary shader that runs optimally on all available hardware. So an intermediate binary representation is used to simplify the application runtime, and it’s the driver’s job to generate the final hardware binary from that intermediate shader representation.

With cross-API technology, you need different shaders for different APIs. Moreover, some platforms do not allow compiling shaders while the application is running and require precompiled shader input. For Vulkan, that input is SPIR-V, a binary intermediate representation that is typically compiled from GLSL shader code. The Vulkan runtime loads this binary directly, and the driver translates it for the hardware. The OpenGL ARB_gl_spirv extension allows loading the same SPIR-V binary into the OpenGL runtime with minor modifications related to samplers and textures. Unfortunately, this only works on Nvidia: AMD and Intel drivers can’t handle geometry and tessellation shaders loaded from a SPIR-V binary.
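To make the idea of a binary shader a little more concrete, here is a minimal sketch that decodes the five-word header every SPIR-V module starts with. The magic number and header layout come from the SPIR-V specification; the function name and the hand-built example module are illustrative assumptions, not part of any engine or driver.

```python
import struct

SPIRV_MAGIC = 0x07230203  # magic number defined by the SPIR-V specification


def parse_spirv_header(blob: bytes) -> dict:
    """Decode the five 32-bit words at the start of a SPIR-V module."""
    if len(blob) < 20:
        raise ValueError("SPIR-V module is shorter than its 20-byte header")
    # The magic number doubles as an endianness marker, so try both orders.
    for order in ("<", ">"):
        magic, version, generator, bound, schema = struct.unpack(order + "5I", blob[:20])
        if magic == SPIRV_MAGIC:
            return {
                "little_endian": order == "<",
                # Version word layout: 0 | major | minor | 0 (one byte each).
                "version": ((version >> 16) & 0xFF, (version >> 8) & 0xFF),
                "generator": generator,
                "id_bound": bound,
                "schema": schema,
            }
    raise ValueError("not a SPIR-V module: bad magic number")


# Hand-built header for illustration only: version 1.5, hypothetical id bound 42.
example = struct.pack("<5I", SPIRV_MAGIC, 0x00010500, 0, 42, 0)
header = parse_spirv_header(example)
print(header["version"], header["id_bound"])
```

A check like this is a cheap way to reject a corrupted or mislabeled file before handing it to the graphics API.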

Problems appear when the same shader needs to run on the Direct3D12, Metal, or WebGPU APIs. SPIR-V cross-compilation tools such as SPIRV-Cross make this possible by translating binary shaders to HLSL or MSL source, with many tweak parameters for resource binding. That translation is not fully compatible between platforms, so the engine must know how best to transform the parameters for each platform. After that, the shader source code can be compiled by d3dcompiler, dxcompiler, or the Metal toolset for the required runtime.
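The resource binding problem can be sketched as follows. Vulkan addresses resources by (set, binding) pairs, while HLSL uses separate register classes (b for constant buffers, t for shader resources, s for samplers, u for unordered access views). The register classes are standard HLSL; the allocation policy below (one running counter per class, in set/binding order) and all names are assumptions made for illustration, and real tools expose many more options.

```python
from collections import defaultdict

# HLSL register classes are standard; which kinds we model here is a simplification.
HLSL_REGISTER_CLASS = {
    "uniform_buffer": "b",   # cbuffer
    "sampled_texture": "t",  # shader resource view
    "sampler": "s",          # sampler state
    "storage_buffer": "u",   # read-write buffer, unordered access view
    "storage_image": "u",    # read-write texture, unordered access view
}


def assign_hlsl_registers(resources):
    """Map Vulkan-style (set, binding, kind) triples to HLSL register names.

    Hypothetical policy: walk resources in (set, binding) order and hand out
    the next free slot in the matching register class.
    """
    counters = defaultdict(int)
    mapping = {}
    for set_index, binding, kind in sorted(resources):
        reg_class = HLSL_REGISTER_CLASS[kind]
        mapping[(set_index, binding)] = f"{reg_class}{counters[reg_class]}"
        counters[reg_class] += 1
    return mapping


resources = [
    (0, 0, "uniform_buffer"),
    (0, 1, "sampled_texture"),
    (0, 2, "sampler"),
    (1, 0, "sampled_texture"),
    (1, 1, "storage_buffer"),
]
registers = assign_hlsl_registers(resources)
for key in sorted(registers):
    print(key, "->", registers[key])
```

The same table has to be produced again with different rules for Metal, which indexes buffers, textures, and samplers separately, and that is why the engine rather than the shader author is the natural owner of this mapping.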

Another option is to use HLSL shaders as input and cross-compile them to SPIR-V for Vulkan. But that also requires resource binding magic. Every platform wants its own shader language, incompatible with the others. SPIR-V is a great attempt at a common standard, but in practice only the Vulkan API consumes it reliably. Other platforms require different shader languages or binary formats.

The number of different shader types is also growing. We have Vertex, Fragment, Geometry, Control and Evaluate (the two tessellation stages), Compute, Task, Mesh, RayGen, RayMiss, Closest, AnyHit, Intersection, and Callable shader types. All of them have different input and output semantics. Luckily, the Khronos Group provides tools to validate and compile all of these shaders. We tried to use these tools but, unfortunately, it was impossible to cross-compile our Compute, Tessellation, and Geometry shaders with them. So we created our own shader pipeline based on the GLSL and SPIR-V specifications, and we are excited to tell you more about it.
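For readers who know these stages by their Khronos names, the sketch below pairs each shader type listed above with the file suffix that the glslang tools use to infer the stage. The suffixes are glslang conventions; the dictionary and helper names are illustrative assumptions, not engine identifiers.

```python
import os

# Shader type (as named in this article) -> glslang file suffix for that stage.
SHADER_STAGE_SUFFIX = {
    "Vertex": "vert",
    "Fragment": "frag",
    "Geometry": "geom",
    "Control": "tesc",       # tessellation control
    "Evaluate": "tese",      # tessellation evaluation
    "Compute": "comp",
    "Task": "task",
    "Mesh": "mesh",
    "RayGen": "rgen",
    "RayMiss": "rmiss",
    "Closest": "rchit",      # ray tracing closest hit
    "AnyHit": "rahit",
    "Intersection": "rint",
    "Callable": "rcall",
}

SUFFIX_TO_STAGE = {suffix: stage for stage, suffix in SHADER_STAGE_SUFFIX.items()}


def stage_from_filename(path: str) -> str:
    """Return the shader type implied by a file suffix such as 'water.tese'."""
    suffix = os.path.splitext(path)[1].lstrip(".")
    if suffix not in SUFFIX_TO_STAGE:
        raise ValueError(f"unknown shader stage suffix: {suffix!r}")
    return SUFFIX_TO_STAGE[suffix]


print(len(SHADER_STAGE_SUFFIX), stage_from_filename("water.tese"))
```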

We use GLSL in our Tellusim Engine as the primary language for all platforms. All shader types, including Mesh and Raytracing, are supported. Because our shader toolset is so fast, we can skip the offline shader compilation step and do everything at runtime, and it stays fast with any amount of code. For platforms that do not allow compiling shaders at runtime, we use a precompiled shader cache.
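The interplay between runtime compilation and a precompiled cache can be sketched like this. It is a generic illustration and not the engine’s actual implementation: the class, the key scheme (a SHA-256 hash of target and source), and the stand-in compiler are all assumptions.

```python
import hashlib


class ShaderCache:
    """Return compiled shaders from a cache, compiling on a miss when allowed."""

    def __init__(self, compile_fn=None, precompiled=None):
        # compile_fn is None on a (hypothetical) platform without runtime compilation.
        self.compile_fn = compile_fn
        self.entries = dict(precompiled or {})

    @staticmethod
    def key(source: str, target: str) -> str:
        # Hash target and source together so each back end gets its own entry.
        return hashlib.sha256(f"{target}\0{source}".encode("utf-8")).hexdigest()

    def get(self, source: str, target: str) -> bytes:
        k = self.key(source, target)
        if k not in self.entries:
            if self.compile_fn is None:
                raise LookupError(f"no precompiled {target} shader and runtime compilation is unavailable")
            self.entries[k] = self.compile_fn(source, target)
        return self.entries[k]


compile_calls = []


def fake_compile(source: str, target: str) -> bytes:
    """Stand-in for a real compiler: records the call and returns placeholder bytes."""
    compile_calls.append(target)
    return f"{target}:{len(source)}".encode("utf-8")


source = "void main() {}"
runtime_cache = ShaderCache(compile_fn=fake_compile)
first = runtime_cache.get(source, "spirv")
second = runtime_cache.get(source, "spirv")   # served from the cache, no recompilation

# Ship the populated entries to a platform that cannot compile at runtime.
offline_cache = ShaderCache(precompiled=runtime_cache.entries)
shipped = offline_cache.get(source, "spirv")
print(len(compile_calls), shipped == first)
```

Keying on the source text means any shader edit invalidates its entry automatically, at the cost of hashing the source on every lookup.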

The GravityMark GPU benchmark requires more than 20K lines of GLSL shader code, and the application takes far longer to start when the Khronos glslang compiler is used for GLSL to SPIR-V compilation.

Here is a log from a build using the Khronos glslang compiler:

M:  63.32 ms: Creating 1600x900 Vulkan Window
M:   1.493 s: Creating SceneManager
M:   9.346 s: Creating RenderManager
M:  12.431 s: Creating Scene
M:  13.551 s: Creating 200,000 Asteroids
M:  13.701 s: Updating Scene
M:  13.851 s: GravityMark v1.2 is Ready in 13.9 s

And this is the Clay shader compiler doing the same job about 9 times faster (1.5 s instead of 13.9 s):

M:  58.59 ms: Creating 1600x900 Vulkan Window
M: 288.47 ms: Creating SceneManager
M: 411.18 ms: Creating RenderManager
M: 541.40 ms: Creating Scene
M:   1.289 s: Creating 200,000 Asteroids
M:   1.364 s: Updating Scene
M:   1.500 s: GravityMark v1.2 is Ready in 1.5 s
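The two logs above can be compared directly by parsing their timestamps. The snippet below is a small sketch that uses only the log lines shown in this article; the helper names are our own.

```python
import re

GLSLANG_LOG = """
M:  63.32 ms: Creating 1600x900 Vulkan Window
M:   1.493 s: Creating SceneManager
M:   9.346 s: Creating RenderManager
M:  12.431 s: Creating Scene
M:  13.551 s: Creating 200,000 Asteroids
M:  13.701 s: Updating Scene
M:  13.851 s: GravityMark v1.2 is Ready in 13.9 s
"""

CLAY_LOG = """
M:  58.59 ms: Creating 1600x900 Vulkan Window
M: 288.47 ms: Creating SceneManager
M: 411.18 ms: Creating RenderManager
M: 541.40 ms: Creating Scene
M:   1.289 s: Creating 200,000 Asteroids
M:   1.364 s: Updating Scene
M:   1.500 s: GravityMark v1.2 is Ready in 1.5 s
"""


def parse_log(text):
    """Return a list of (seconds_since_start, message) tuples."""
    entries = []
    for line in text.strip().splitlines():
        match = re.match(r"M:\s*([\d.]+) (ms|s):\s*(.+)", line)
        if match:
            value = float(match.group(1))
            seconds = value / 1000.0 if match.group(2) == "ms" else value
            entries.append((seconds, match.group(3)))
    return entries


glslang_ready = parse_log(GLSLANG_LOG)[-1][0]  # timestamp of the final "Ready" line
clay_ready = parse_log(CLAY_LOG)[-1][0]
speedup = glslang_ready / clay_ready
print(f"{glslang_ready:.3f} s vs {clay_ready:.3f} s: {speedup:.1f}x faster start-up")
```

Note that this ratio covers the whole start-up sequence, including work that has nothing to do with shader compilation.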

There is no difference in FPS between the two shader compilers.

Conversion from the SPIR-V representation to other shader languages is just as fast. Moreover, all resource bindings are handled automatically by the engine. Only one GLSL shader is needed for all supported platforms, including CUDA and WebGPU. This gives great flexibility and significantly reduces the time it takes to develop new features. We can also use all the debugging tools available on the supported platforms.

Some GLSL features, such as embedded arrays, are not supported because we don’t need them.

You can download the latest Clay shader compiler command-line tool for Windows and Linux, with all shader-language back ends (Vulkan SPIR-V, OpenGL SPIR-V, OpenGL GLSL, OpenGLES GLSL, Direct3D12 HLSL, Direct3D11 HLSL, WebGPU WGSL, Metal MSL, CUDA, and HIP), here: