Question 1

Which is better for AI inference: L40 or RTX 5090?

Accepted Answer

The L40 has 16 GB more VRAM (48 GB vs 32 GB), making it better suited to large models and long context windows. For compute-bound workloads such as training, the RTX 5090 delivers about 1.2× the FP16 throughput (210 vs 181 TFLOPS). At $0.44/hr for the RTX 5090 versus $0.82/hr for the L40, the RTX 5090 is the more cost-efficient choice for inference whenever the model fits in 32 GB.

Question 2

How much VRAM does the L40 have compared to the RTX 5090?

Accepted Answer

The L40 has 48 GB of VRAM (GDDR6), while the RTX 5090 has 32 GB (GDDR7).

Question 3

Which GPU is cheaper to rent in the cloud, the L40 or RTX 5090?

Accepted Answer

The cheapest on-demand price for the L40 is $0.82/hr, while the RTX 5090 starts at $0.44/hr. The RTX 5090 is the more affordable option.

	L40	RTX 5090
VRAM	48 GB	32 GB
VRAM Type	GDDR6	GDDR7
Memory Bandwidth	0.9 TB/s	1.8 TB/s
FP16 Performance	181 TFLOPS	210 TFLOPS
Manufacturer	NVIDIA	NVIDIA
FP8 Support	Yes	Yes
FP4 Support	No	Yes

	L40	RTX 5090
$/hr (cheapest)	$0.82	$0.44 (best)
$/TFLOP (compute value)	$0.0045	$0.0021 (best)
$/GB VRAM (memory value)	$0.0171	$0.0138 (best)
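
The value figures in the table above appear to be the cheapest hourly price divided by FP16 TFLOPS and by VRAM capacity, respectively. The short Python sketch below reproduces that arithmetic from the listed specs. The dictionary layout and the `value_metrics` helper are illustrative names, not part of any provider's API, and the formula itself is an assumption inferred from the published numbers.

```python
# Figures copied from the specification and pricing tables above.
gpus = {
    "L40": {"price_per_hr": 0.82, "fp16_tflops": 181, "vram_gb": 48},
    "RTX 5090": {"price_per_hr": 0.44, "fp16_tflops": 210, "vram_gb": 32},
}


def value_metrics(spec):
    """Return (hourly $ per FP16 TFLOP, hourly $ per GB of VRAM).

    Assumed formula: cheapest hourly price divided by each resource.
    """
    per_tflop = spec["price_per_hr"] / spec["fp16_tflops"]
    per_gb = spec["price_per_hr"] / spec["vram_gb"]
    return per_tflop, per_gb


metrics = {name: value_metrics(spec) for name, spec in gpus.items()}

for name, (per_tflop, per_gb) in metrics.items():
    print(f"{name}: ${per_tflop:.4f}/TFLOP, ${per_gb:.4f}/GB VRAM")

# Ratio behind the "about 1.2x FP16 throughput" statement in the verdict.
fp16_ratio = gpus["RTX 5090"]["fp16_tflops"] / gpus["L40"]["fp16_tflops"]
print(f"RTX 5090 / L40 FP16 ratio: {fp16_ratio:.2f}")
```

Under these assumptions the results should agree with the table to rounding, and the FP16 ratio comes out slightly below the rounded 1.2× quoted in the verdict. Swapping in a different hourly price (for example, a spot rate) only requires changing `price_per_hr`.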

Provider	On-demand	Spot
Vast.ai	$0.44/hr	—
RunPod	$0.99/hr	$0.99/hr
