Question 1

Which is better for AI inference: A40 or H100 NVL?

Accepted Answer

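It depends on model size and budget. The H100 NVL has 46 GB more VRAM (94 GB vs 48 GB), which makes it better suited to large models and long context windows. For compute-bound workloads such as training, the H100 NVL delivers about 5.6× the FP16 throughput of the A40. At $0.44/hr versus $1.80/hr, the A40 is the more cost-efficient choice for inference when the model fits in its 48 GB of VRAM.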

Question 2

How much VRAM does the A40 have compared to the H100 NVL?

Accepted Answer

The A40 has 48 GB of VRAM (GDDR6), while the H100 NVL has 94 GB (HBM3).

Question 3

Which GPU is cheaper to rent in the cloud, the A40 or H100 NVL?

Accepted Answer

The cheapest on-demand price for the A40 is $0.44/hr, while the H100 NVL starts at $1.80/hr, so the A40 is the more affordable option per hour.

Specification	A40	H100 NVL
VRAM	48 GB	94 GB
VRAM Type	GDDR6	HBM3
Memory Bandwidth	0.7 TB/s	3.9 TB/s
FP16 Performance	150 TFLOPS	835 TFLOPS
Manufacturer	NVIDIA	NVIDIA
FP8 Support	No	Yes
FP4 Support	No	No
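
The headline figures quoted in the answers (46 GB more VRAM, roughly 5.6× the FP16 throughput) follow directly from the table above. A minimal sketch of that arithmetic, using only the table values; the `specs` dictionary and the `compare` helper are illustrative names, not part of any tool:

```python
# Spec values copied from the table above.
specs = {
    "A40": {"vram_gb": 48, "bandwidth_tbs": 0.7, "fp16_tflops": 150},
    "H100 NVL": {"vram_gb": 94, "bandwidth_tbs": 3.9, "fp16_tflops": 835},
}


def compare(a, b):
    """Describe how GPU b compares to GPU a: VRAM gap in GB and throughput ratios."""
    return {
        "vram_diff_gb": b["vram_gb"] - a["vram_gb"],
        "fp16_ratio": b["fp16_tflops"] / a["fp16_tflops"],
        "bandwidth_ratio": b["bandwidth_tbs"] / a["bandwidth_tbs"],
    }


result = compare(specs["A40"], specs["H100 NVL"])
print(f"VRAM difference: {result['vram_diff_gb']} GB")
print(f"FP16 throughput ratio: {result['fp16_ratio']:.1f}x")
print(f"Memory bandwidth ratio: {result['bandwidth_ratio']:.1f}x")
```

The memory bandwidth ratio comes out close to the FP16 ratio, so on these listed figures the H100 NVL's advantage is similar for bandwidth-bound and compute-bound work.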

Metric	A40	H100 NVL
$/hr (cheapest)	$0.44 (best)	$1.80
$/TFLOP (compute value)	$0.0029	$0.0022 (best)
$/GB VRAM (memory value)	$0.0092 (best)	$0.0191
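
The value metrics in this table appear to be the cheapest hourly price divided by FP16 TFLOPS and by VRAM in GB. A short sketch under that assumption; the `gpus` dictionary and `value_metrics` function are illustrative names:

```python
# Cheapest hourly price and specs, copied from the tables on this page.
gpus = {
    "A40": {"price_hr": 0.44, "fp16_tflops": 150, "vram_gb": 48},
    "H100 NVL": {"price_hr": 1.80, "fp16_tflops": 835, "vram_gb": 94},
}


def value_metrics(gpu):
    """Assumed definitions: hourly price per FP16 TFLOP and per GB of VRAM."""
    return {
        "usd_per_tflop": gpu["price_hr"] / gpu["fp16_tflops"],
        "usd_per_gb": gpu["price_hr"] / gpu["vram_gb"],
    }


metrics = {name: value_metrics(gpu) for name, gpu in gpus.items()}
for name, m in metrics.items():
    print(f"{name}: ${m['usd_per_tflop']:.4f}/TFLOP, ${m['usd_per_gb']:.4f}/GB VRAM")
```

Under this assumption the computed values round to the same four-decimal figures shown in the table: the H100 NVL is cheaper per TFLOP, while the A40 is cheaper per GB of VRAM.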

H100 NVL cloud pricing
Provider	On-demand	Spot
Vast.ai	$1.80/hr	n/a
RunPod	$2.89/hr	$2.39/hr
Lambda	$3.29/hr	n/a
Nebius	$3.85/hr	$2.15/hr
Google Cloud	$4.20/hr	$1.15/hr
Amazon Web Services	$6.88/hr	$2.53/hr
Microsoft Azure	$6.98/hr	$6.98/hr
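
To pick the cheapest provider from a list like this one, the rows can be compared per pricing column while skipping providers that list no spot price. A minimal sketch using the prices above; the `prices` list and `cheapest` helper are illustrative, and a missing spot price is represented as `None`:

```python
# (provider, on-demand $/hr, spot $/hr or None when no spot price is listed)
prices = [
    ("Vast.ai", 1.80, None),
    ("RunPod", 2.89, 2.39),
    ("Lambda", 3.29, None),
    ("Nebius", 3.85, 2.15),
    ("Google Cloud", 4.20, 1.15),
    ("Amazon Web Services", 6.88, 2.53),
    ("Microsoft Azure", 6.98, 6.98),
]

ON_DEMAND, SPOT = 1, 2  # column indexes into each row


def cheapest(rows, column):
    """Return the row with the lowest price in the given column, ignoring gaps."""
    offered = [row for row in rows if row[column] is not None]
    return min(offered, key=lambda row: row[column])


cheapest_on_demand = cheapest(prices, ON_DEMAND)
cheapest_spot = cheapest(prices, SPOT)
print(f"Cheapest on-demand: {cheapest_on_demand[0]} at ${cheapest_on_demand[ON_DEMAND]:.2f}/hr")
print(f"Cheapest spot: {cheapest_spot[0]} at ${cheapest_spot[SPOT]:.2f}/hr")
```

With the listed figures, the cheapest on-demand and cheapest spot offers come from different providers, so the choice depends on whether interruptible capacity is acceptable for the workload.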
