Question 1

Which is better for AI inference: L4 or L40?

Accepted Answer

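The L40 has 48 GB of VRAM, 24 GB more than the L4, which makes it better suited to large models and long context windows. For compute-bound workloads such as training, the L40 delivers roughly 1.5× the FP16 throughput of the L4 (181 vs 121 TFLOPS). At $0.39/hr versus $0.82/hr, however, the L4 is the more cost-efficient choice for inference, provided the model fits in its 24 GB.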
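
The ratios behind this verdict can be checked with a few lines of arithmetic. The sketch below is illustrative only: the dictionary layout and variable names are assumptions, and the figures are the ones quoted on this page (peak FP16 numbers and cheapest on-demand prices), not measured throughput.

```python
# Figures quoted on this page; the dict layout is an illustrative assumption.
gpus = {
    "L4": {"vram_gb": 24, "fp16_tflops": 121, "usd_per_hr": 0.39},
    "L40": {"vram_gb": 48, "fp16_tflops": 181, "usd_per_hr": 0.82},
}

l4, l40 = gpus["L4"], gpus["L40"]

# How much more memory, peak compute, and hourly cost the L40 carries.
vram_gap_gb = l40["vram_gb"] - l4["vram_gb"]
fp16_ratio = l40["fp16_tflops"] / l4["fp16_tflops"]
price_ratio = l40["usd_per_hr"] / l4["usd_per_hr"]

print(f"L40 extra VRAM: {vram_gap_gb} GB")
print(f"L40 / L4 peak FP16: {fp16_ratio:.2f}x")
print(f"L40 / L4 hourly price: {price_ratio:.2f}x")
```

On these numbers the L40 costs about 2.1× as much per hour for about 1.5× the peak FP16 throughput, which is why the L4 comes out ahead on cost whenever 24 GB is enough.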

Question 2

How much VRAM does the L4 have compared to the L40?

Accepted Answer

The L4 has 24 GB of VRAM (GDDR6), while the L40 has 48 GB (GDDR6).
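
To get a feel for what 24 GB versus 48 GB means in practice, here is a rough weights-only estimate. It is a sketch under stated assumptions, not a sizing tool: it assumes 2 bytes per parameter at FP16 and 1 byte at FP8, treats 1 GB as 1e9 bytes, ignores KV cache, activations, and runtime overhead, and uses hypothetical model sizes.

```python
# Rough weights-only estimate. Assumptions (not from this page):
# 2 bytes/parameter at FP16, 1 byte/parameter at FP8, 1 GB = 1e9 bytes.
# KV cache, activations, and runtime overhead are ignored, so real
# headroom is smaller than this suggests.
VRAM_GB = {"L4": 24, "L40": 48}
BYTES_PER_PARAM = {"fp16": 2, "fp8": 1}


def weights_gb(params_billion: float, dtype: str) -> float:
    """Approximate size of the model weights in GB."""
    return params_billion * BYTES_PER_PARAM[dtype]


def fits(params_billion: float, dtype: str, gpu: str) -> bool:
    """True if the weights alone fit in the GPU's VRAM."""
    return weights_gb(params_billion, dtype) <= VRAM_GB[gpu]


# Hypothetical model sizes, in billions of parameters.
for size in (7, 13, 30):
    for dtype in ("fp16", "fp8"):
        row = ", ".join(f"{gpu}: {fits(size, dtype, gpu)}" for gpu in VRAM_GB)
        print(f"{size}B {dtype} ({weights_gb(size, dtype):.0f} GB) -> {row}")
```

Under these assumptions a 13B-parameter model at FP16 (about 26 GB of weights) exceeds the L4 but fits the L40, while the same model at FP8 fits both cards.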

Question 3

Which GPU is cheaper to rent in the cloud, the L4 or L40?

Accepted Answer

The cheapest on-demand price for the L4 is $0.39/hr, while the L40 starts at $0.82/hr, making the L4 the more affordable option at roughly half the hourly rate.

	L4	L40
VRAM	24 GB	48 GB
VRAM Type	GDDR6	GDDR6
Memory Bandwidth	0.3 TB/s	0.9 TB/s
FP16 Performance	121 TFLOPS	181 TFLOPS
Manufacturer	NVIDIA	NVIDIA
FP8 Support	Yes	Yes
FP4 Support	No	No

	L4	L40
$/hr (cheapest)	$0.39 (best)	$0.82
$/hr per FP16 TFLOPS (compute value)	$0.0032 (best)	$0.0045
$/hr per GB VRAM (memory value)	$0.0163 (best)	$0.0171
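
The compute-value and memory-value rows appear to be the cheapest hourly price divided by the FP16 figure and by the VRAM size, respectively. The sketch below reproduces that arithmetic from the numbers on this page; the function and key names are illustrative assumptions.

```python
# Cheapest on-demand price, peak FP16, and VRAM as quoted on this page.
specs = {
    "L4": {"usd_per_hr": 0.39, "fp16_tflops": 121, "vram_gb": 24},
    "L40": {"usd_per_hr": 0.82, "fp16_tflops": 181, "vram_gb": 48},
}


def value_metrics(gpu: str) -> dict:
    """Hourly price normalised by peak FP16 TFLOPS and by GB of VRAM."""
    s = specs[gpu]
    return {
        "usd_per_tflops": s["usd_per_hr"] / s["fp16_tflops"],
        "usd_per_gb": s["usd_per_hr"] / s["vram_gb"],
    }


for name in specs:
    m = value_metrics(name)
    print(
        f"{name}: {m['usd_per_tflops']:.4f} $/hr per TFLOPS, "
        f"{m['usd_per_gb']:.4f} $/hr per GB"
    )
```

Lower is better for both metrics. The gap is much wider on compute value than on memory value, where the two cards are nearly level.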

Provider	On-demand	Spot
RunPod	$0.39/hr	$0.39/hr
Google Cloud	$0.56/hr	$0.17/hr
Amazon Web Services	$0.80/hr	$0.13/hr
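
Picking the cheapest provider depends on whether on-demand or spot capacity is acceptable. As a minimal sketch, assuming the prices in the table above are current, the lookup could be written as follows; the list layout and the `cheapest` helper are hypothetical names used for illustration.

```python
# Prices in USD per hour, copied from the provider table above.
prices = [
    {"provider": "RunPod", "on_demand": 0.39, "spot": 0.39},
    {"provider": "Google Cloud", "on_demand": 0.56, "spot": 0.17},
    {"provider": "Amazon Web Services", "on_demand": 0.80, "spot": 0.13},
]


def cheapest(rows: list, mode: str) -> dict:
    """Return the row with the lowest price for 'on_demand' or 'spot'."""
    return min(rows, key=lambda r: r[mode])


for mode in ("on_demand", "spot"):
    best = cheapest(prices, mode)
    print(f"Cheapest {mode}: {best['provider']} at ${best[mode]:.2f}/hr")
```

The ranking flips between the two modes: RunPod is cheapest on-demand, while Amazon Web Services is cheapest on spot. Spot instances can usually be interrupted by the provider, so the lower rate suits workloads that tolerate restarts.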

L4 vs L40
