December 16, 2021

Mesh Shader Performance

We have some new, questionable results for Mesh Shader and MultiDrawIndirect performance on AMD GPUs. Consider the case where we render geometry as independent triangles without an index buffer. The VS Draw Arrays column represents this rendering mode, in which the number of Vertex Shader invocations equals the number of primitives times 3. Technically, this mode does the same work as the Mesh Shader rendering mode, but without generating indices. That’s why it’s surprising to see it outperform both MultiDrawIndirect and Mesh Shader on the Radeon 6700 XT. So if you are optimizing geometry for the vertex cache or generating the best possible meshlets, these numbers suggest it makes little sense to do so for AMD GPUs. Other GPUs can share “Vertex Shader” output between adjacent triangles.
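To make the "primitives times 3" figure concrete, here is a minimal sketch that counts Vertex Shader invocations for a regular grid mesh in two cases: non-indexed independent triangles, and an idealized indexed draw where every unique vertex is shaded exactly once. The helper names and the 64×64 grid are assumptions chosen for illustration, and the "ideal reuse" count is a lower bound, since real hardware reuses vertex output only within a limited window.

```python
def grid_indices(cols, rows):
    """Build an index buffer for a cols x rows grid of quads (two triangles per quad)."""
    indices = []
    stride = cols + 1  # number of vertices in one row of the grid
    for y in range(rows):
        for x in range(cols):
            v0 = y * stride + x   # top-left corner of the quad
            v1 = v0 + 1           # top-right
            v2 = v0 + stride      # bottom-left
            v3 = v2 + 1           # bottom-right
            indices += [v0, v1, v2, v2, v1, v3]
    return indices


def vs_invocations_draw_arrays(indices):
    """Independent triangles, no index buffer: one invocation per triangle corner."""
    return len(indices)


def vs_invocations_ideal_reuse(indices):
    """Idealized model (assumption): each unique vertex is shaded exactly once."""
    return len(set(indices))


indices = grid_indices(64, 64)
triangles = len(indices) // 3
arrays = vs_invocations_draw_arrays(indices)
ideal = vs_invocations_ideal_reuse(indices)

print(f"triangles:             {triangles}")
print(f"draw arrays VS calls:  {arrays}")
print(f"ideal indexed VS calls: {ideal}")
print(f"ratio:                 {arrays / ideal:.2f}x")
```

For this grid the non-indexed path runs the Vertex Shader roughly six times more often than the idealized indexed path, which is the gap that vertex cache optimization and meshlet generation try to close.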

| GPU | VS Draw Elements | VS Draw Arrays | MDI | MS | CS |
|---|---|---|---|---|---|
| Radeon 5600 M | 5.0 B | 1.5 B | 1.1 B | | 8.3 B |
| Radeon 6700 XT | 14.5 B | 4.8 B | 4.1 B | 4.6 B | 19.5 B |
| Radeon 6900 XT | 17.6 B | 7.2 B | 4.1 B | 9.1 B | 34.5 B |
| GeForce RTX 2080 Ti | 12.3 B | 5.6 B | 12.5 B | 13.3 B | 18.3 B |
| GeForce RTX 3090 | 14.3 B | 5.9 B | 14.6 B | 20.7 B | 28.8 B |
| Intel DG1 | 1.3 B | 227 M | 1.1 B | | 2.5 B |
| Apple M1 | 1.4 B | 556 M | 930 M | | 2.5 B |
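The comparison in the text can be read off the table as simple ratios. The sketch below is only arithmetic on the published numbers: it takes the four GPUs that have a value in every column, treats the "B" suffix as billions, and prints how Mesh Shader throughput relates to VS Draw Arrays and to MultiDrawIndirect. The dictionary layout and the `ratio` helper are assumptions made for illustration.

```python
# Values copied from the table, in billions ("B").
# Only GPUs with a value in every column are included.
results = {
    "Radeon 6700 XT":      {"VS Draw Elements": 14.5, "VS Draw Arrays": 4.8, "MDI": 4.1,  "MS": 4.6,  "CS": 19.5},
    "Radeon 6900 XT":      {"VS Draw Elements": 17.6, "VS Draw Arrays": 7.2, "MDI": 4.1,  "MS": 9.1,  "CS": 34.5},
    "GeForce RTX 2080 Ti": {"VS Draw Elements": 12.3, "VS Draw Arrays": 5.6, "MDI": 12.5, "MS": 13.3, "CS": 18.3},
    "GeForce RTX 3090":    {"VS Draw Elements": 14.3, "VS Draw Arrays": 5.9, "MDI": 14.6, "MS": 20.7, "CS": 28.8},
}


def ratio(gpu, numerator, denominator):
    """Throughput of one rendering mode relative to another on the same GPU."""
    return results[gpu][numerator] / results[gpu][denominator]


for gpu in results:
    ms_vs_arrays = ratio(gpu, "MS", "VS Draw Arrays")
    ms_vs_mdi = ratio(gpu, "MS", "MDI")
    print(f"{gpu:20s}  MS / VS Draw Arrays = {ms_vs_arrays:.2f}   MS / MDI = {ms_vs_mdi:.2f}")
```

A ratio below 1.0 in the first column means the non-indexed Vertex Shader path beats Mesh Shaders, which in this data happens only on the Radeon 6700 XT.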

Compute versus Hardware

Mesh Shader versus MultiDrawIndirect