[Microsoft Azure Cloud]: How to create low-priority VMs?

Go MasterZero

Nov 5, 2018, 5:50:17 AM

to LCZero

Hi everyone,

I'm trying to create low-cost, low-priority instances on Microsoft Azure, but no matter where I looked, I couldn't find this setting.

It should be available as stated here :

https://azure.microsoft.com/en-us/blog/low-priority-scale-sets/

And hinted here too :

https://youtu.be/dRujj2oQA_g?t=225

On Google Cloud, though, it is very easy (a preemptible instance is just a Yes/No setting).

Does anyone know?

Also, what is the difference between a VM and a scale set?

Is a scale set the same as Google Cloud's instance groups? Or are Google "instance groups" the same as Azure "resource groups"?

Thanks.

LuckyDay

Nov 5, 2018, 7:07:24 AM

to LCZero

There are no preemptible instances offered by Microsoft Azure as far as I know, which is what makes it inferior to Google Cloud.

It is still certainly usable, though at their pricing you can only rent a V100 for around 70 hours or so before using up all the credit.

Matt Blakely made a guide on how to set it up in the forums:

https://groups.google.com/forum/#!topic/lczero/OnHcUB-00ZQ

Go MasterZero

Nov 5, 2018, 10:57:59 AM

to LCZero

@LuckyDay

@Matt Blakely

I see, I will dig deeper into that then.

1) By the way, I'd like to mention that in the leela-zero project (where I come from), we use automated instructions with managed instance groups to automatically recreate preemptible instances after they are preempted. You may find them helpful if you didn't know this could be done: https://docs.google.com/document/d/1P_c-RbeLKjv1umc4rMEgvIVrUUZSeY0WAtYHjaxjD64/edit

2) Which leads to my second question:

Leaving aside the question of low-priority costs, is it possible to create the equivalent of Google Cloud managed instance groups on Azure, meaning a group that automatically recreates and restarts the number of instances we want if they are deleted or preempted? (On Google Cloud that number is 1, because the GPU quota is 1 on free trials.)

Thanks again.

Note: it also seems to me that the Tesla P100 is more cost-efficient than the V100 on Microsoft Azure, but I haven't run tests yet.

Go MasterZero

Nov 5, 2018, 11:24:59 AM

to LCZero

After some digging, it seems you can create low-priority VMs on Microsoft Azure if you use Azure Batch:

https://azure.microsoft.com/en-us/pricing/details/batch/

quote :

"Azure Batch provides job scheduling and cluster management, allowing applications or algorithms to run in parallel at scale.

There’s no charge for Batch itself, only the underlying compute and other resources consumed to run your batch jobs, including applicable software license costs. For compute, Cloud Services, Linux Virtual Machines, or Windows Virtual Machines can be utilized by Batch. The standard rates for compute apply and can be viewed below and software licensing costs for batch graphics and rendering are available below. In addition, Batch allows low-priority virtual machines (VMs) to be used. Reserved Virtual Machine Instances are available when using the Azure Batch Service in User subscription pool application mode."

Go MasterZero

Nov 5, 2018, 12:09:34 PM

to LCZero

Edit: my bad, V100s are actually cheaper with low priority.

Full details are in the documentation ("What is Azure Batch?"):

https://docs.microsoft.com/en-us/azure/batch/batch-technical-overview

[screenshot: low priority.PNG]

Go MasterZero

Nov 5, 2018, 12:37:34 PM

to LCZero

Full details here (the "What is Azure Batch?" documentation, the portal quickstart, and the low-priority VM page):

https://docs.microsoft.com/en-us/azure/batch/batch-technical-overview

https://docs.microsoft.com/en-us/azure/batch/quick-create-portal

https://docs.microsoft.com/en-us/azure/batch/batch-low-pri-vms

Matt Blakely

Nov 5, 2018, 1:40:59 PM

to LCZero

Interesting, I will have to check it out.

I know about Azure Batch from work but hadn't considered using it here.

It may require a minimum size or similar that ultimately doesn't save money, but it's worth checking out.

Go MasterZero

Nov 5, 2018, 1:58:16 PM

to LCZero

Yes, interesting indeed.

This requires some time reading and some testing, but whenever I find a solution I'll come back to you with it.

LuckyDay

Nov 5, 2018, 3:57:18 PM

to LCZero

Yes, I have heard that the Leela Zero project has automation to restart preemptible instances.

At present, though, it looks like Google Cloud has disabled GPU use on free trials, requiring people to sign up for paid accounts to use the $300 credit.

Automatically restarted instances would run the risk of people exceeding their credit and getting charged;

in that way, the 24-hour limit of preemptible instances at least acts as something of a safety net, imo.

Getting low-priority V100s set up for Azure trials would be great if such a thing exists, as test40 will need a boost.

Go MasterZero

Nov 6, 2018, 8:14:38 AM

to LCZero

Update:

Create a Batch account if you don't have one, as explained in the documentation above.

Then go to:

All resources -> Batch account -> Pools -> Add

For some reason I could only use NC6 (half a Tesla K80); NC6v2 (Tesla P100) and NC6v3 (Tesla V100) were not allowed. I guess they use a different quota.

Still, creating a pool of low-priority nodes (= a group of multiple low-priority VM instances) is fairly easy with Azure Batch, as explained in the documentation I linked above.

Vocabulary:

Batch account = an account used to create pools of nodes

pool = equivalent of a Google "instance group"

node = a VM/instance that belongs to a pool, regardless of whether it is full cost (dedicated) or low priority

For testing, I chose 0 dedicated nodes and 2 low-priority nodes, both NC6 (K80).
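
For reference, the same pool settings can probably also be expressed with the Azure CLI instead of the portal. This is only a sketch under assumptions: the pool name is hypothetical, the image and node agent values are guesses for Ubuntu 18.04, and the flag names are as I recall them from the Azure CLI, so check them with `az batch pool create --help` before running anything. By default the function only prints the command (dry run).

```shell
#!/bin/sh
# Hypothetical values: adjust to your own account and region.
POOL_ID="test-nc6-lowpri"
VM_SIZE="Standard_NC6"                       # NC6 = half a Tesla K80
IMAGE="canonical:ubuntuserver:18.04-LTS"     # assumed image reference
NODE_AGENT="batch.node.ubuntu 18.04"         # assumed node agent SKU

# run prints the command when DRY_RUN=1 (the default) and executes it otherwise.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+'
        printf ' %s' "$@"
        printf '\n'
    else
        "$@"
    fi
}

# 0 dedicated nodes and 2 low-priority nodes, as in the test described above.
create_pool() {
    run az batch pool create \
        --id "$POOL_ID" \
        --vm-size "$VM_SIZE" \
        --image "$IMAGE" \
        --node-agent-sku-id "$NODE_AGENT" \
        --target-dedicated-nodes 0 \
        --target-low-priority-nodes 2
}

create_pool
```

Setting DRY_RUN=0 would actually run the command; that requires being logged in to the Batch account first (for example with `az batch account login`).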

Creating and managing nodes is easy, but!

The problem is with setting start tasks (the equivalent of cloud-init).

What a mess this is!

In Pools -> yourpool1 -> Start Task

the only command that I managed to run successfully at startup was:

sudo apt-get update

as you can see here:

[screenshot: update works.PNG]

While my script ran fine elsewhere, I ran into all kinds of errors on Azure Batch, including:

- #!/bin/bash simply doesn't work (I saw that some docs use /bin/bash)

- sudo apt-get update followed by any other command returns an error saying that no argument can be used after update

- submodule update --init --recursive also returns an error

- sudo add-apt-repository -y ppa:graphics-drivers/ppa also returns a "not more than one repository" error

Every time I tried a different command, I rebooted all nodes (with the "terminate" option).

To try a different command, just go back to Pools -> yourpool1 -> Start Task,

change it to whatever you want, then click on the box above to make the "Save" button appear, save, and reboot your nodes with the "terminate" option.

You can also manually delete your nodes and use the "Scale" option to recreate them; they will then run the start task you defined.

The only command I have managed to get working so far is sudo apt-get update.

I know this is not related to lczero, but this script ran fine at VM creation

(just a quick copy-paste from the Google Cloud script, so not "pretty", but it has been tested to work on a manually created Azure VM with cloud-init):

#!/bin/bash
sudo apt-get update && sudo apt-get -y upgrade && sudo apt-get -y dist-upgrade && sudo add-apt-repository -y ppa:graphics-drivers/ppa && sudo apt-get update && sudo apt-get -y install nvidia-driver-410 linux-headers-generic nvidia-opencl-dev && sudo apt-get -y install clinfo cmake git libboost-all-dev libopenblas-dev zlib1g-dev build-essential qtbase5-dev qttools5-dev qttools5-dev-tools libboost-dev libboost-program-options-dev opencl-headers ocl-icd-libopencl1 ocl-icd-opencl-dev qt5-default qt5-qmake curl && git clone https://github.com/gcp/leela-zero && cd leela-zero && git checkout next && git pull && git clone https://github.com/gcp/leela-zero && git submodule update --init --recursive && mkdir build && cd build && cmake .. && cmake --build . && cd ../autogtp && cp ../build/autogtp/autogtp . && cp ../build/leelaz . && sudo apt-get -y install glances zip && sudo apt-get clean && sudo -i && cd /leela-zero/autogtp && ./autogtp

As you can see here, it works flawlessly in manual VM mode:

[screenshot: cloud init works.PNG]

You can see a command's log in:

Pools -> yourpool -> Nodes -> yournode1 -> Files:

1) stdout.txt: this works like the serial log of a manually created VM, as shown above with sudo apt-get update working (same screenshot again below)

2) stderr.txt: if there is any error with the start task, it will appear in this file

[screenshot: update works.PNG]

Then the nodes go back to the idle state, as I failed to make them run any script successfully (except the very basic one, but it's still a success!).

Have you been successful with any script or command?

I was trying with Ubuntu 18.04; I will try with 16.04 now.

Go MasterZero

Nov 6, 2018, 8:32:33 AM

to LCZero

Example of an error I'm getting in stderr.txt:

[screenshot: stderr init error.PNG]

Go MasterZero

Nov 6, 2018, 10:29:11 AM

to LCZero

You may find this sample script from the Azure Batch documentation helpful:

https://docs.microsoft.com/en-us/azure/batch/scripts/batch-cli-sample-run-job

Matt Blakely

Nov 6, 2018, 1:21:16 PM

to LCZero

Try Ubuntu 16; that's what we usually use on Google and what I used for my non-Batch Azure VMs.

LuckyDay

Nov 6, 2018, 2:34:35 PM

to LCZero

Very promising stuff, especially with test40 maybe coming up in a few weeks.

I'd also like to mention that I was told IBM offers a cloud service as well; it has a free/lite version that, from what I gather, essentially offers you ~25 hours of K80 compute time monthly for free. Fairly modest but potentially worth setting up.

https://www.ibm.com/cloud/machine-learning/pricing

Go MasterZero

Nov 6, 2018, 2:38:40 PM

to LCZero

I already managed to get it working; I'll upload the pictures in a few minutes.

Go MasterZero

Nov 6, 2018, 2:41:24 PM

to LCZero

So you need a special syntax for the script, as shown in their documentation.

The script in my next post works until the game starts (I assume the failure is an OpenCL error because a restart is needed):

[screenshot: test 15-2.PNG]

[screenshot: test 15-3.PNG]

[screenshot: test 15-31.PNG]

[screenshot: test 15-4.PNG]

[screenshot: test 15-5.PNG]

Go MasterZero

Nov 6, 2018, 2:44:20 PM

to LCZero

This is the script that was shown to work:

/bin/bash -c 'sudo -i && uname -a && sudo add-apt-repository -y ppa:graphics-drivers/ppa && sudo apt-get update && sudo apt-get -y install nvidia-driver-410 linux-headers-generic nvidia-opencl-dev && sudo apt-get -y install clinfo cmake git libboost-all-dev libopenblas-dev zlib1g-dev build-essential qtbase5-dev qttools5-dev qttools5-dev-tools libboost-dev libboost-program-options-dev opencl-headers ocl-icd-libopencl1 ocl-icd-opencl-dev qt5-default qt5-qmake curl && git clone https://github.com/gcp/leela-zero && cd leela-zero && git submodule update --init --recursive && mkdir build && cd build && cmake .. && cmake --build . && cd ../autogtp && cp ../build/autogtp/autogtp . && cp ../build/leelaz . && clinfo && ./autogtp'
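
A note on why the /bin/bash -c '...' wrapper matters: the Azure Batch documentation states that a task command line does not run under a shell, so operators such as && are not interpreted unless a shell is invoked explicitly. That would also explain the earlier error where apt-get complained about arguments after update. The sketch below imitates both launch styles locally, with echo as a harmless stand-in; the helper names are made up for illustration, and plain whitespace splitting is only an approximation of what Batch really does.

```shell
#!/bin/sh
cmd='echo update && echo done'

# Launch the way a shell-less launcher would: split on whitespace and hand
# every token, "&&" included, to the first program as a plain argument.
run_without_shell() {
    # $1 is left unquoted on purpose so that it is split into words.
    set -- $1
    "$@"
}

# Launch through a shell, which interprets "&&" as "then run the next command".
run_with_shell() {
    /bin/sh -c "$1"
}

run_without_shell "$cmd"
run_with_shell "$cmd"
```

In the first call echo receives "&&", "echo" and "done" as ordinary arguments, which is the same situation apt-get was in; in the second call two separate commands run one after the other.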

And this is the script I'm currently testing to implement an auto-reboot (it works successfully on Google Cloud):

/bin/bash -c 'PKG_OK=$(dpkg-query -W --showformat='${Status}\n' glances|grep "install ok installed")
echo Checking for glanceslib: $PKG_OK
if [ "" == "$PKG_OK" ]; then
  echo "No glanceslib. Setting up glanceslib and all other leela-zero packages."
  sudo -i && uname -a && sudo add-apt-repository -y ppa:graphics-drivers/ppa && sudo apt-get update && sudo apt-get -y install nvidia-driver-410 linux-headers-generic nvidia-opencl-dev && sudo apt-get -y install clinfo cmake git libboost-all-dev libopenblas-dev zlib1g-dev build-essential qtbase5-dev qttools5-dev qttools5-dev-tools libboost-dev libboost-program-options-dev opencl-headers ocl-icd-libopencl1 ocl-icd-opencl-dev qt5-default qt5-qmake curl && git clone https://github.com/gcp/leela-zero && cd leela-zero && git submodule update --init --recursive && mkdir build && cd build && cmake .. && cmake --build . && cd ../autogtp && cp ../build/autogtp/autogtp . && cp ../build/leelaz . && clinfo && sudo apt-get -y install glances && sudo reboot
else 
  sudo -i && cd /leela-zero/autogtp && ./autogtp
fi'

Go MasterZero

Nov 6, 2018, 2:47:49 PM

to LCZero

Here is the OpenCL error I mentioned earlier:

[screenshot: worked 6.PNG]

Go MasterZero

Nov 6, 2018, 2:54:16 PM

to LCZero

About Ubuntu 16.04: since it was working with Ubuntu 18.04 on Google Cloud, I saw no reason to change that.

What works on 18.04 should also work on 16.04, barring specific cases.

So now the 2 remaining issues to deal with are:

1) How to overcome the OpenCL error.

With manual VM creation there was no need to reboot; as you can see here, it works successfully:

[screenshot: cloud init works.PNG]

2) How to make NC6v2 and NC6v3 available.

For some reason I get "access denied" when trying NC6v3:

[screenshot: not allowed.PNG]

It should work, though.

Go MasterZero

Nov 6, 2018, 2:56:00 PM

to LCZero

So far I have only been using NC6 (half a Tesla K80).

If you try any script variation, let me know if you find one that works without needing a reboot to initialize OpenCL and the NVIDIA driver.

I'll continue next time.

Matt Blakely

Nov 6, 2018, 3:04:00 PM

to LCZero

GoMaster -> If you can't deploy a certain VM class, it's probably because you have to request access. On these non-enterprise pay-as-you-go accounts, many VM SKUs are not enabled by default.

Go to Subscriptions -> Usage + Quotas

Use the filters to find the VM class; I bet it shows 0 as your quota.

If so, use "Request Increase" to file a support request. Just pick 24x7 email support; they usually do it within 4 hours (it's automated unless you request a lot).

If you don't get the option to request an increase, then you need to upgrade to a "pay-as-you-go" account, but I believe you already did that.

Go MasterZero

Nov 6, 2018, 3:11:25 PM

to LCZero

I already increased the quota for NC6v2 and NC6v3 to 100.

I believe this issue is specific to Azure Batch.

Regardless of this, I think setting "wait for success" to "true" and then creating a scheduled job may do the trick, with a start task like this:

/bin/bash -c 'sudo -i && uname -a && sudo add-apt-repository -y ppa:graphics-drivers/ppa && sudo apt-get update && sudo apt-get -y install nvidia-driver-410 linux-headers-generic nvidia-opencl-dev && sudo apt-get -y install clinfo cmake git libboost-all-dev libopenblas-dev zlib1g-dev build-essential qtbase5-dev qttools5-dev qttools5-dev-tools libboost-dev libboost-program-options-dev opencl-headers ocl-icd-libopencl1 ocl-icd-opencl-dev qt5-default qt5-qmake curl && git clone https://github.com/gcp/leela-zero && cd leela-zero && git submodule update --init --recursive && mkdir build && cd build && cmake .. && cmake --build . && cd ../autogtp && cp ../build/autogtp/autogtp . && cp ../build/leelaz . && clinfo'

Then, in Azure Batch -> Jobs (or Job schedules), set up a job.

Will try next time.

Go MasterZero

Nov 7, 2018, 5:41:12 AM

to LCZero

Good news!

It works!

After some digging, I found that "sudo reboot" breaks the start task script, so it is better not to use it: the start task runs at EVERY boot, while we need the reboot only after the first boot, to install the GPU drivers so that clinfo successfully detects the GPU (without a reboot, clinfo returns "number of platforms 0").
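
One common way to make a step run only on the first boot is a marker file: do the work if the marker is missing, then create it. Below is a minimal sketch of that idea. The run_once helper and the marker path are hypothetical, echo is a stand-in for the driver installation, and this has not been tested on Azure Batch. On a real node the marker would have to live somewhere that survives a reboot.

```shell
#!/bin/sh
# run_once MARKER COMMAND [ARGS...]
# Runs COMMAND only if MARKER does not exist yet, then creates MARKER.
run_once() {
    marker=$1
    shift
    if [ -e "$marker" ]; then
        echo "skip: $marker exists"
        return 0
    fi
    # Create the marker only if the command succeeded.
    "$@" && : > "$marker"
}

# Demo in a temporary directory with a harmless stand-in command.
demo_dir=$(mktemp -d)
run_once "$demo_dir/drivers-installed" echo "installing drivers (stand-in)"
run_once "$demo_dir/drivers-installed" echo "installing drivers (stand-in)"
rm -rf "$demo_dir"
```

The first call runs the stand-in command and creates the marker; the second call only reports that the marker exists.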

As the write-up is quite long and much easier for me to edit on GitHub, I'm linking you to the GitHub comment I wrote (many screenshots):

https://github.com/gcp/leela-zero/issues/1905#issuecomment-436572506

If you have any questions or improvements to suggest, I'd be glad to improve it too.

Last thing: I just emailed Microsoft support about the Batch account quotas. This is the form they asked me to fill in; I'm waiting for their answer:

Hello,

Thanks for your response.

As per your previous email I can understand you want us to increase the batch account cores for Ncsv3 and Ncsv2 series.

Request you to please confirm below details so that we can go ahead and engage capacity team.

Batch Account Name: testp100

Region: West Europe

VM Type: NC6v3 (and NC6v2 too)

Additional Quantity: 0 (Please add Zero if you do not have the current Limit)

Current Limit: 0 (Please add Zero if you do not have the current Limit)

New Limit: 50 cores (for NC6v2 : 50 cores too)

Dedicated or Low priority : Low priority for all the 50 cores (same for the NC6v2)

Best Regards,

LuckyDay

Nov 7, 2018, 7:03:31 AM

to LCZero

Out of interest, what sort of performance (game generation) are you getting, and how much is it costing to run?

I understand that with test30 game generation being the default, that would reduce game generation by up to 40%, so for comparison, cloud V100s would generate around 650 games/hour, or 15-16k games daily.

Go MasterZero

Nov 7, 2018, 7:40:18 AM

to LCZero

Prices are much cheaper with low-priority nodes, as you can see here:

https://azure.microsoft.com/en-us/pricing/details/batch/

[screenshot: price nc6.PNG]

[screenshot: price nc6v2.PNG]

[screenshot: price nc6v3.PNG]

As you can see, the price difference is night and day!

The ones that interest us are (low priority vs. full price):

- NC6 (half a Tesla K80): 0.233 vs 1.166 $/hour (80% cheaper)

- NC6v2 (Tesla P100): 0.537 vs 2.682 $/hour (80% cheaper)

- NC6v3 (Tesla V100): 0.796 vs 3.978 $/hour (80% cheaper)
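
The "80% cheaper" figures can be checked directly from the listed prices. A small sketch (discount_pct is a helper name made up here, and it relies on awk being available):

```shell
#!/bin/sh
# discount_pct LOW FULL: percentage saved by paying LOW instead of FULL.
discount_pct() {
    awk -v low="$1" -v full="$2" 'BEGIN { printf "%.1f\n", (1 - low / full) * 100 }'
}

# Prices in $/hour as quoted in this post.
discount_pct 0.233 1.166   # NC6
discount_pct 0.537 2.682   # NC6v2
discount_pct 0.796 3.978   # NC6v3
```

All three come out at 80.0 when rounded to one decimal.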

It is indeed much, much cheaper and, what's more, entirely automated.

Performance is what you would expect from half a Tesla K80 (but to give precise stats I'd need an hour containing only 5%-resign games, and half a Tesla K80 is quite weak to handle -g 2, i.e. 2 games simultaneously, to begin with).

Go MasterZero

Nov 7, 2018, 7:48:29 AM

to LCZero

Here is a screenshot of our 40b Leela Zero autogtp game production to give a rough idea (NC6 is half a Tesla K80):

Game production was set with -g 2 (I know from my testing that it is 25% faster than one game at a time on a Tesla V100 on Google Cloud), but here, as one game was a no-resign game, it basically cut production speed in half on this half Tesla K80.

[screenshot: performance test.PNG]

LuckyDay

Nov 7, 2018, 8:19:46 AM

to LCZero

Interesting, those low-priority prices look quite comparable to Google Cloud prices, maybe only marginally higher. Can the low-priority prices be paid with the free credit that Azure provides?

Also, if I'm reading your script correctly, you're using OpenCL for Leela Zero (the Go version)? Is there a particular reason you're not using CUDA, since that should probably work better on these NVIDIA GPUs?

With Leela Chess Zero there is quite a stark difference between running CUDA and running OpenCL, particularly if you can utilise fp16, as on the V100 or 20xx cards.

Go MasterZero

Nov 7, 2018, 8:35:19 AM

to LCZero

Yes, we use OpenCL in leela-zero, not cuDNN.

This has been discussed, including here: https://github.com/gcp/leela-zero/issues/1724

I don't know much about why myself, but it seems related to a GPL licensing issue.

Of course, low priority will first consume the free trial credit before actually making you pay (but with pay-as-you-go, you need to manually stop the spending by stopping the instances and all resources!).

Go MasterZero

Nov 7, 2018, 8:36:58 AM

to LCZero

Whenever I get support's reply about unlocking the NC6v3 Batch quota, I'll tell you how it goes and how the V100 performs.

Go MasterZero

Nov 7, 2018, 9:10:10 AM

to LCZero

Support's answer:

Hi,

Thank you for the response.

We have engaged the Capacity Management Team to increase the quota.

We will be in constant touch with them to expedite the process. Once we have any update from them, we will keep you posted.

Please feel free to get back to me if you have any questions or concerns.

Best Regards,

So all that's left is to wait and see.

Go MasterZero

Nov 8, 2018, 12:38:41 PM

to LCZero

Good news! NC6v3 low priority works now,

after a few back-and-forth email exchanges with support (my question was misunderstood at first, then handled unsuccessfully, then finally solved):

[screenshot: nc6v3 works.PNG]

I went ahead and asked them how to make my Google Cloud script work on Microsoft Azure; see the next comment.

Go MasterZero

Nov 8, 2018, 12:39:13 PM

to LCZero

HELP needed:

This is a slightly modified version of the script:

/bin/bash -c 'PKG_OK=$(dpkg-query -W --showformat='${Status}\n' glances|grep "install ok installed")
echo Checking for glanceslib: $PKG_OK
if [ "" == "$PKG_OK" ]; then
  echo "No glanceslib. Setting up glanceslib and all other leela-zero packages."
  sudo -i && uname -a && sudo add-apt-repository -y ppa:graphics-drivers/ppa && sudo apt-get update && sudo apt-get -y install nvidia-driver-410 linux-headers-generic nvidia-opencl-dev && sudo apt-get -y install clinfo cmake git libboost-all-dev libopenblas-dev zlib1g-dev build-essential qtbase5-dev qttools5-dev qttools5-dev-tools libboost-dev libboost-program-options-dev opencl-headers ocl-icd-libopencl1 ocl-icd-opencl-dev qt5-default qt5-qmake curl && sudo apt-get -y install glances zip && clinfo && pwd && sudo reboot
else 
  sudo -i && uname -a && clinfo && git clone https://github.com/gcp/leela-zero && cd leela-zero && git submodule update --init --recursive && mkdir build && cd build && cmake .. && cmake --build . && cd ../autogtp && cp ../build/autogtp/autogtp . && cp ../build/leelaz . && ./autogtp -g 2
fi'

What it does:
- on first boot (glances is not installed): install the NVIDIA driver and glances, then reboot
- on the second and later boots (glances is installed): build and run autogtp

Question 1:
On Google Cloud this worked flawlessly, but on Microsoft Azure my condition is still false after the first reboot (glances is not detected as installed), so the first branch always runs again and the node reboots in an endless loop.
Any idea why it doesn't work like on Google Cloud?

Question 2:
Does the start task have a maximum run time, or can it run indefinitely (until the node is preempted)?

Question 3:
When a node is preempted, I couldn't find an option to delete it automatically at preemption, like Google Cloud does (I would then want the Batch account to automatically scale up and create a new node).
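
Regarding question 1, one possible cause (not confirmed anywhere in this thread) is the quoting. The whole script is wrapped in /bin/bash -c '...', but it also contains '${Status}\n' in single quotes. If the command line is parsed with shell-like quoting rules, the inner quotes close the outer ones, ${Status} is treated as an empty variable and \n becomes a plain n, so dpkg-query never prints "install ok installed". How Azure Batch actually splits the command line is an assumption here; the sketch below only shows the effect in a local shell, with printf standing in for dpkg-query.

```shell
#!/bin/sh
unset Status

# Same shape as the script above: the inner '...' around ${Status}\n closes
# the outer single quote, so the OUTER shell sees ${Status}\n unquoted.
broken=$(/bin/sh -c 'printf "%s\n" --showformat='${Status}\n'')

# Double quotes inside the single-quoted string keep the text intact until
# the inner shell, where \$ and \\ produce a literal $ and a backslash.
fixed=$(/bin/sh -c 'printf "%s\n" --showformat="\${Status}\\n"')

printf '%s\n' "broken: $broken"
printf '%s\n' "fixed:  $fixed"
```

On Google Cloud the script is presumably not wrapped in another layer of single quotes, which would explain why the same test works there. A check that needs no nested quotes at all, such as testing for a marker file, would sidestep the problem.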

Go MasterZero

Nov 13, 2018, 4:12:46 AM

to LCZero

Update:

After extensive testing, I chose to go for simplicity and manually schedule a reboot, as you can see here:

https://github.com/gcp/leela-zero/issues/1905#issuecomment-438181846

The Tesla V100 indeed performs slightly faster than on Google Cloud: 7 games produced in 23 minutes (versus around 5-6 games on Google Cloud).

However, for the lczero project, these all-in-one cloud images may interest you:

Data Science VM (includes NVIDIA cuDNN, CUDA, etc. preinstalled):

https://azuremarketplace.microsoft.com/en-US/marketplace/apps/microsoft-ads.linux-data-science-vm

Ubuntu Batch container 16.04 (includes NVIDIA preinstalled):

https://azuremarketplace.microsoft.com/en-us/marketplace/apps/microsoft-azure-batch.ubuntu-server-container?tab=Overview

Go MasterZero

Nov 14, 2018, 2:51:02 AM

to LCZero

@LuckyDay

@Matt Blakely

This is the performance I got over almost a day

with a low-priority Tesla V100 ($0.796/hour).

Note: on Microsoft Azure, low-priority nodes can last up to 7 days (vs 24 hours on Google Cloud).

236 games in 1378 minutes on Azure

(a bit slower due to a higher frequency of no-resign games, which are much slower)

which is comparable to what I got on Google Cloud:

229 games in 1188 minutes
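
To compare the two runs on the same scale, the figures above can be converted to games per hour and, for Azure, to cost per game at the quoted low-priority price. A small sketch (both helper names are made up here, and it relies on awk; the Google Cloud price is not given in this thread, so no cost is computed for it):

```shell
#!/bin/sh
# games_per_hour GAMES MINUTES
games_per_hour() {
    awk -v g="$1" -v m="$2" 'BEGIN { printf "%.2f\n", g * 60 / m }'
}

# cost_per_game PRICE_PER_HOUR GAMES MINUTES
cost_per_game() {
    awk -v p="$1" -v g="$2" -v m="$3" 'BEGIN { printf "%.3f\n", p * m / 60 / g }'
}

games_per_hour 236 1378        # Azure run
games_per_hour 229 1188        # Google Cloud run
cost_per_game 0.796 236 1378   # Azure, low-priority V100 price in $/hour
```

With these numbers that is roughly 10.3 games/hour on Azure versus 11.6 on Google Cloud, and a little under $0.08 per game on Azure.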

[screenshot: g ssh games 21.PNG]

Satisfying so far. What remains is the question of what happens after preemption: how to restart the script easily, if possible without wasting credit during the idle time.

LuckyDay

Nov 14, 2018, 5:00:10 AM

to LCZero

Very nice. I will have to try to figure out how to implement this for lc0, in anticipation of test40 coming out. I'm not very technically inclined, so having your efforts to refer to is very much appreciated!

Go MasterZero

Nov 14, 2018, 5:57:55 AM

to LCZero

Glad to help.

Similarly, I would really appreciate it if you or someone here wrote detailed instructions for the first few steps, as I am too lazy to do it.

These would be a big help for the instructions I'm planning to write.

Steps (if possible with screenshots):

- how to navigate the Microsoft Azure portal (as it is very unintuitive)

- how to check the remaining credit (billing)

- how to activate the pay-as-you-go option

- how to request a quota increase specifically for Batch accounts to unlock NC6v3 (see the mail I sent previously to Microsoft) (Help + support requests)

- how to navigate through resources (All resources) and delete unneeded ones

- how to create a Batch account and navigate inside it (All resources)

Then I'm planning to start my instructions from the point where all these prerequisites are fulfilled, possibly also with a video tutorial (but I won't record anything past the creation of an NC6v3 low-priority node).

Secondly, I'm currently running the test81 and test80 pools in the same Batch account (2 low-priority nodes with NC6v3). When one or both get preempted (running -> idle), I will see how to restart them as efficiently and with as little wasted credit as possible.

Go MasterZero

Nov 14, 2018, 10:03:54 AM

to LCZero

Great news!

After a node is preempted, the job schedule automatically restarts the script at the recurrence interval that was set (every 30 minutes for me, from the moment the node is preempted until the script restarts).

Full draft instructions are updated here:

https://github.com/gcp/leela-zero/issues/1905#issuecomment-438181846

[screenshot: next runs.PNG]