How to train on multiple TPU cores?


Rahul

Feb 10, 2021, 7:26:20 AM
to Swift for TensorFlow
Hi everyone!

Is there any way to train a model on all 8 cores of a TPU? The example in this Colab notebook only shows how to train on a single TPU core.

Regards,
Rahul Bhalley

tristan...@googlemail.com

Feb 10, 2021, 7:43:25 AM
to Rahul, Swift for TensorFlow
Hi Rahul,

Why would you need to train on 8 cores?
Are you aware of the energy resources that requires?

Sent from my iPhone


Rahul

Feb 13, 2021, 1:54:22 AM
to Swift for TensorFlow, Rahul
Hi S4TF team,

I found some documentation on training a model on multiple TPU cores here, but I'm unable to get it to work.

I'm trying to initialize ThreadState as follows: 

var threadState = ThreadState(
  model: model,
  optimizer: optimizer,
  id: 0,
  devices: tpuDevices,
  useAutomaticMixedPrecision: false)

where

let tpuDevices = Device.allDevices.filter { $0.kind == .TPU }

But it gives the following error:

error: Couldn't lookup symbols:
  x10_training_loop.ThreadState.init(model: τ_0_0, optimizer: τ_0_1, id: Swift.Int, devices: Swift.Array<TensorFlow.Device>, useAutomaticMixedPrecision: Swift.Bool) -> x10_training_loop.ThreadState<τ_0_0, τ_0_1>

Moreover, it's unclear how to set the thread id argument; it seems like something TensorFlow should manage automatically behind the scenes once it's given all the devices (i.e., the TPU cores) to train on.

Looking forward to some help.

Regards
Rahul Bhalley


Brennan Saeta

Feb 13, 2021, 6:23:15 PM
to Rahul, Swift for TensorFlow
Hi Rahul!

Just giving you a quick response: to drive multiple devices you should start multiple threads (one per device). Be sure to number each thread, and then pass that as the ID to the ThreadState initializer, as that's used to identify which device to drive.
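
For example, setting up the per-thread state might look like this (a rough sketch I haven't compiled, reusing the `model`, `optimizer`, and `tpuDevices` names from your message and the initializer signature shown in your error):

// One ThreadState per TPU core. Each thread's index doubles as its id,
// which ThreadState uses to pick the entry in `devices` that thread drives.
let devices = tpuDevices
let state = devices.indices.map { threadId in
  ThreadState(
    model: model,
    optimizer: optimizer,
    id: threadId,
    devices: devices,
    useAutomaticMixedPrecision: false)
}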

Some example code (this assumes a `func parallelMap` extension on Collection that runs the closure with the specified number of threads; a rough sketch of one possible implementation follows the snippet):

let combinedLoss = (0..<devices.count).parallelMap(nThreads: devices.count) {
  (threadId: Int) -> [Float] in
  // Each thread drives the device whose index matches its thread id.
  return state[threadId].run(
    train: dataset.training,
    test: dataset.test,
    crossReplicaSumDevices: crsDevices,
    scheduleLearningRate: { opt in
      opt.learningRate = lrFinderScheduleLearningRate(opt.step + 1)
    },
    lossFunction: { ŷ, y in
      labelSmoothingCrossEntropy(logits: ŷ, labels: y, alpha: 0.1)
    },
    maxIterations: maxIterations - allLoss.count)
}.map { $0 }[0]  // One [Float] comes back per thread; take thread 0's losses.
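
In case it's useful, here's roughly what that `parallelMap` extension could look like. To be clear, this isn't part of the standard library or swift-models; it's just one possible implementation, written for the case above where `nThreads == count` (one dedicated thread per element):

import Foundation

extension Collection {
  // Runs `transform` over every element concurrently, one dedicated thread
  // per element, and returns the results in element order. This sketch
  // assumes the caller passes nThreads == count, as in the snippet above.
  func parallelMap<T>(nThreads: Int, _ transform: @escaping (Element) -> T) -> [T] {
    precondition(nThreads == count, "this sketch spawns one thread per element")
    let elements = Array(self)
    var results = [T?](repeating: nil, count: elements.count)
    let lock = NSLock()
    let group = DispatchGroup()
    for (i, element) in elements.enumerated() {
      group.enter()
      let thread = Thread {
        let value = transform(element)
        lock.lock()
        results[i] = value
        lock.unlock()
        group.leave()
      }
      thread.start()
    }
    group.wait()  // Block until every thread has stored its result.
    return results.map { $0! }
  }
}

Since each element above corresponds to one device, this gives you the one-thread-per-device setup described earlier.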

As for the error you're encountering about not being able to find the initializer, I don't quite know what might cause that off the top of my head. Sorry! Maybe someone else might know?

Hope that helps!

All the best,
-Brennan
