Arjun Sunil
1 min read · Jun 22, 2020

You got that right, @coxidev.

But apart from the nodes not running when no notebooks are in use, we're also using spot instances, which helps cut the cost down significantly.

Because of scale-down to zero, you're not tied to the fixed monthly cost of running a GPU node 24*30 (24 hours a day for the full month) when you only ran a notebook for 10 hours in total.

As for the numbers, unfortunately I don't have figures for how much cost reduction we got from scaling down to zero, since we deployed the cluster with downscaling enabled by default.

If you have a rough estimate of the number of hours you're going to train, you can calculate an estimated cost using the hourly instance price from here: https://aws.amazon.com/ec2/spot/pricing/

If you make sure the notebook servers are shut down as soon as your training is done, the amount you calculate this way should be close to the actual numbers.
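
To make that arithmetic concrete, here's a rough sketch in Python. The hourly rate is a made-up placeholder, not a real AWS price, and the function names are just for illustration; plug in the spot price for your own instance type and region. It also only covers instance hours, so anything else on the bill (storage, for example) is left out.

```python
HOURS_PER_MONTH = 24 * 30  # an always-on node over a 30-day month


def scale_to_zero_cost(hours_used: float, hourly_rate_usd: float) -> float:
    """Estimated cost when the node only runs while notebooks are in use."""
    return hours_used * hourly_rate_usd


def always_on_cost(hourly_rate_usd: float) -> float:
    """Estimated cost of keeping the same node up for the whole month."""
    return HOURS_PER_MONTH * hourly_rate_usd


# Placeholder values (assumptions, not real prices or measurements).
rate = 0.50   # USD per hour, hypothetical spot rate
hours = 10    # total notebook hours in the month

print(f"scale to zero: ${scale_to_zero_cost(hours, rate):.2f}")
print(f"always on:     ${always_on_cost(rate):.2f}")
```

The gap between the two figures is what scaling down to zero saves you, as long as the notebook servers really are shut down when training finishes.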

I hope that answers all your questions.

Written by Arjun Sunil

Tinkerer by instinct, MLOps Engineer by trade ✌🏾 I solve real-world problems using Tech, AI & 3D printing. Connect: connect@arjunsunil.com
