Scientific computation with Tensors


Filipe Duarte

Feb 5, 2021, 3:46:12 PM
to Swift for TensorFlow
Are Tensors the right choice for scientific computation? In my work, I can't express everything as linear algebra operations, so I have to use for loops to iterate over the arrays for some algorithms.

Brad Larson

Feb 5, 2021, 3:59:29 PM
to Swift for TensorFlow
I guess that depends on the computation you'd like to perform. Is it something that you can express as a massively parallel operation, with enough work that it makes sense to dispatch to an accelerator? If so, Tensor makes it relatively easy to run parallel calculations on GPUs or TPUs. If the calculation can be expressed as a graph, and it is commonly repeated (like in a loop), then the X10 tracing backend behind our Tensor implementation can give you a further performance boost, even more so if there are a lot of matrix multiplies and you have the option of running on TPUs.

If the calculation is serial in nature, or has only small operations and limited parallelism, you might be best served by doing that calculation directly on the CPU (or aided by CPU-side SIMD operations). There is some overhead in dispatching data and calculations to an accelerator, so you want to make sure that your calculation can be sped up enough to offset that overhead. Swift can be pretty fast for CPU-side calculations, and can be made faster by using unsafe pointers, etc. where needed.
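As a rough illustration of the CPU-side SIMD route for small operations, here is a minimal sketch of a dot product using the Swift standard library's `SIMD4<Float>` type. The function name `simdDot` and the lane width of 4 are assumptions chosen for the example, not anything from Swift for TensorFlow, and whether this beats a plain loop would need profiling.

```swift
// Hypothetical helper: dot product of two equal-length Float arrays,
// processed four elements at a time with standard library SIMD vectors.
func simdDot(_ a: [Float], _ b: [Float]) -> Float {
    precondition(a.count == b.count, "inputs must have the same length")

    var acc = SIMD4<Float>(repeating: 0)
    // Largest index range that divides evenly into 4-wide chunks.
    let vectorEnd = a.count - a.count % 4

    var i = 0
    while i < vectorEnd {
        let va = SIMD4<Float>(a[i], a[i + 1], a[i + 2], a[i + 3])
        let vb = SIMD4<Float>(b[i], b[i + 1], b[i + 2], b[i + 3])
        acc = acc + va * vb   // lane-wise multiply, then accumulate
        i += 4
    }

    // Horizontal sum of the four lanes, then a scalar loop for leftovers.
    var result = acc.sum()
    while i < a.count {
        result += a[i] * b[i]
        i += 1
    }
    return result
}
```

For example, `simdDot([1, 2, 3, 4, 5], [1, 1, 1, 1, 1])` handles the first four elements in one SIMD step and the fifth in the scalar tail.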

Filipe Duarte

Feb 5, 2021, 6:13:39 PM
to Swift for TensorFlow, Filipe Duarte

It's a mix of operations: matrix inversion and multiplication, and then serial computations using tensor slices. The algorithms have lots of for loops because each step depends on the state from previous steps.
So I could use Tensor for the matrix multiplication, but what should I do for the serial loops of the algorithm? There are lots of for loops, and inside them I do linear algebra tricks.

Brad Larson

Feb 5, 2021, 6:39:16 PM
to Swift for TensorFlow, Brad Larson
The matrix operations do seem like good candidates for use with accelerator-backed Tensors, although if they're very small and not batched they may still come out ahead with SIMD operations.

If your serial computations can't be parallelized, they might be best handled on the CPU. In that case, we've generally found it to be best to pull down the contents of a Tensor once into a local array (or even an unsafe buffer) and then iterate over that. Repeatedly pulling slices of a Tensor for local calculations can incur a lot of overhead, so it's best to transfer the data once and work on it after that.
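The "transfer once, then iterate" pattern might look roughly like the sketch below. It is plain Swift so it runs without an accelerator: the flat `values` array stands in for data copied out of a Tensor in a single step (in Swift for TensorFlow that would presumably be `tensor.scalars`, assuming a rank-2 `Tensor<Float>` flattened in row-major order). The function name `runningState` and the decay recurrence are made up for illustration of a loop where each step depends on the previous state.

```swift
// Hypothetical serial recurrence over rows of a row-major matrix:
//   state[j] = decay * state[j] + values[t, j]
// Each row depends on the result of the previous row, so the outer loop
// cannot be parallelized, but it only touches local memory.
func runningState(values: [Float], rows: Int, cols: Int, decay: Float) -> [Float] {
    precondition(values.count == rows * cols, "shape must match element count")

    var state = [Float](repeating: 0, count: cols)

    // Iterate through an unsafe buffer over the local copy instead of
    // slicing the Tensor on every step.
    values.withUnsafeBufferPointer { buffer in
        for t in 0..<rows {
            let base = t * cols
            for j in 0..<cols {
                state[j] = decay * state[j] + buffer[base + j]
            }
        }
    }
    return state
}

// Stand-in for a 2x2 Tensor's contents, copied to the host once.
let hostValues: [Float] = [1, 2, 3, 4]
let finalState = runningState(values: hostValues, rows: 2, cols: 2, decay: 0.5)
```

If the result needs to go back to an accelerator, the idea would be to build a single Tensor from the final array rather than writing it back piece by piece.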

Ultimately, you may want to try a few different approaches and profile them to determine where your real bottlenecks are. A lot will come down to the specific calculations you're performing.
