aether archive

Breakdown of H100s for Transformer Inferencing

Mar 30, 2022

This new Nvidia GPU just dropped! This post will analyse what it offers for transformer inferencing.

specs

Here's a spec table to start. The "16-bit format" refers to BFLOAT16 and FLOAT16 while "8-bit format" refers to FP8 or INT8. For INT8 they aren't actually flops, because the "fl" is for float, but I'll continue referring to them as flops because we d…

© 2026 kipply