December 16, 2021

Mesh Shader Performance

We have some new and rather puzzling results for Mesh Shader and MultiDrawIndirect performance on AMD GPUs. Consider rendering geometry as independent triangles without an index buffer. The VS Draw Arrays column represents this rendering mode, where the number of Vertex Shader invocations equals the number of primitives times 3. Technically, this mode does the same work as the Mesh Shader rendering mode, but without generating indices. That is why it is surprising to see it run faster than both MultiDrawIndirect and Mesh Shader on the 6700 XT. So if you are optimizing geometry for the vertex cache or generating the best possible meshlets, judging by these numbers it makes little sense to do it for AMD GPUs. Other GPUs can share “Vertex Shader” output between adjacent triangles.
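
To make the "primitives times 3" point concrete, here is a minimal sketch that counts vertex invocations for a simple quad grid. The grid mesh and the function names are illustrative assumptions, not part of the benchmark, and the "ideal" figure is only a lower bound that assumes every shared vertex is shaded exactly once.

```python
def grid_indices(n):
    """Return triangle indices for an n x n quad grid with (n + 1)^2 vertices."""
    stride = n + 1
    indices = []
    for y in range(n):
        for x in range(n):
            v0 = y * stride + x   # top-left corner of the quad
            v1 = v0 + 1           # top-right
            v2 = v0 + stride      # bottom-left
            v3 = v2 + 1           # bottom-right
            indices += [v0, v1, v2, v2, v1, v3]
    return indices


def draw_arrays_invocations(indices):
    # Independent triangles without an index buffer:
    # every triangle corner is shaded, so invocations = primitives * 3.
    return len(indices)


def ideal_shared_invocations(indices):
    # Lower bound when vertex output is shared between adjacent triangles:
    # each unique vertex is shaded once (assumes perfect reuse).
    return len(set(indices))


indices = grid_indices(64)
num_triangles = len(indices) // 3
print("triangles:              ", num_triangles)
print("draw arrays invocations:", draw_arrays_invocations(indices))
print("ideal shared invocations:", ideal_shared_invocations(indices))
```

For this hypothetical grid, the non-indexed path shades almost six times more vertices than the ideal shared case, which is why it would normally be expected to be the slowest mode.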

GPU                   VS Draw Elements   VS Draw Arrays   MDI      MS       CS
Radeon 5600 M         5.0 B              1.5 B            1.1 B    -        8.3 B
Radeon 6700 XT        14.5 B             4.8 B            4.1 B    4.6 B    19.5 B
Radeon 6900 XT        17.6 B             7.2 B            4.1 B    9.1 B    34.5 B
GeForce RTX 2080 Ti   12.3 B             5.6 B            12.5 B   13.3 B   18.3 B
GeForce RTX 3090      14.3 B             5.9 B            14.6 B   20.7 B   28.8 B
Intel DG1             1.3 B              227 M            1.1 B    -        2.5 B
Apple M1              1.4 B              556 M            930 M    -        2.5 B
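
The comparison in the text can be checked with a little arithmetic on the table. The sketch below divides the MDI and MS figures by the VS Draw Arrays figure; it only uses the four rows that list all five values, and the variable and function names are illustrative.

```python
# (VS Draw Arrays, MDI, MS) copied from the table, in the table's "B" units.
# Only rows that list all five columns are included.
results = {
    "Radeon 6700 XT": (4.8, 4.1, 4.6),
    "Radeon 6900 XT": (7.2, 4.1, 9.1),
    "GeForce RTX 2080 Ti": (5.6, 12.5, 13.3),
    "GeForce RTX 3090": (5.9, 14.6, 20.7),
}


def relative_to_draw_arrays(arrays, mdi, ms):
    """Return (MDI / Draw Arrays, MS / Draw Arrays) throughput ratios."""
    return mdi / arrays, ms / arrays


for gpu, row in results.items():
    mdi_ratio, ms_ratio = relative_to_draw_arrays(*row)
    print(f"{gpu:20s} MDI/Arrays = {mdi_ratio:.2f}  MS/Arrays = {ms_ratio:.2f}")
```

A ratio below 1 means the path is slower than non-indexed Draw Arrays. With the table's numbers, that holds for both MDI and MS on the 6700 XT and for MDI on the 6900 XT, while both GeForce cards stay well above 1.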

Compute versus Hardware
Mesh Shader versus MultiDrawIndirect