[info] How to calculate a GPU's theoretical peak compute throughput

Ref: https://mp.weixin.qq.com/s/XwaboXWn3S9SkZQ7aU3RfQ

Released CUDA GPU architectures covered: Volta / Turing / Ampere

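Peak_FLOPS = F_clk * N_sm * T_ins * 2

where F_clk is the GPU clock frequency, N_sm is the number of SMs (streaming multiprocessors) on the GPU, and T_ins is the number of instructions of the given data type that one SM can issue per clock cycle (a throughput, not a latency). The factor 2 is there because one fused multiply-add counts as two floating-point operations (one multiplication and one addition).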

For example, for the A100's FP32 CUDA cores, T_ins = 64, F_clk = 1.41 GHz, N_sm = 108:

Peak_FLOPS = 1.41 * 108 * 64 * 2 = 19,491 GFLOPS (about 19.5 TFLOPS)
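
The arithmetic above can be checked with a short Python sketch. The function name `peak_gflops` and its parameter names are illustrative only (they do not come from any library); the A100 figures are the ones quoted above.

```python
def peak_gflops(f_clk_ghz: float, n_sm: int, t_ins: int, ops_per_ins: int = 2) -> float:
    """Theoretical peak in GFLOPS: F_clk * N_sm * T_ins * 2.

    f_clk_ghz   -- clock frequency in GHz
    n_sm        -- number of SMs on the GPU
    t_ins       -- instructions of the given data type per SM per clock cycle
    ops_per_ins -- 2, because one multiply-add counts as two floating-point operations
    """
    return f_clk_ghz * n_sm * t_ins * ops_per_ins


# A100 FP32 CUDA cores, using the values from the example above.
a100_fp32 = peak_gflops(f_clk_ghz=1.41, n_sm=108, t_ins=64)
print(f"A100 FP32 peak: {a100_fp32:,.2f} GFLOPS ({a100_fp32 / 1000:.1f} TFLOPS)")
```

Because the frequency is given in GHz, the result comes out directly in GFLOPS; divide by 1000 for TFLOPS.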

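FLOPS_real = Total FLOPs / T_calc

where Total FLOPs is the number of floating-point operations the workload actually performs and T_calc is the measured compute time. FLOPS_real can then be compared against Peak_FLOPS to see how close a kernel gets to the hardware limit.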
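
As a rough sketch of how the achieved figure relates to the peak, the snippet below computes the achieved FLOPS for a dense matrix multiplication and compares it with the A100 FP32 peak from the example above. The 2 * M * N * K operation count is the usual convention for a dense matmul. The function names, the problem size, and the 10 ms timing are hypothetical values chosen for illustration, not measurements.

```python
def gemm_flops(m: int, n: int, k: int) -> int:
    """FLOPs in a dense (m x k) @ (k x n) matmul: one multiply and one add per term."""
    return 2 * m * n * k


def achieved_gflops(total_flops: float, t_calc_s: float) -> float:
    """Achieved throughput = Total FLOPs / T_calc, reported in GFLOPS."""
    if t_calc_s <= 0:
        raise ValueError("t_calc_s must be positive")
    return total_flops / t_calc_s / 1e9


# A100 FP32 peak in GFLOPS, from the example above.
PEAK_GFLOPS = 1.41 * 108 * 64 * 2

total = gemm_flops(4096, 4096, 4096)  # hypothetical problem size
t_calc = 0.010                        # hypothetical measured time: 10 ms
real = achieved_gflops(total, t_calc)
utilization = real / PEAK_GFLOPS

print(f"Achieved: {real:,.1f} GFLOPS, utilization: {utilization:.1%}")
```

In practice T_calc would come from a GPU-side timer around the kernel, so that launch overhead and host-device transfers are not counted as compute time.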
