Breakdown of H100s for Transformer Inferencing

Mar 30, 2022

Nvidia's new H100 GPU just dropped! This post will analyse what it offers for transformer inferencing.

specs

Here's a spec table to start. The "16-bit format" refers to BFLOAT16 and FLOAT16 while "8-bit format" refers to FP8 or INT8. For INT8 they aren't actually flops, because the "fl" is for float, but I'll continue referring to them as flops because we d…

Continue reading this post for free, courtesy of kipply.
