About recent API changes and high-level APIs


Richard Wei

Jun 2, 2018, 9:08:47 PM
to Swift for TensorFlow
Hi all,

Earlier this week, we made a few changes to the Swift TensorFlow library.

1. `Tensor.dot` and `⊗` are renamed to `matmul`. (commit)

We deprecated the `⊗` operator and the `Tensor.dot` method in favor of established conventions for matrix multiplication, and for mathematical correctness. From now on, the API for matrix multiplication is `matmul`, the same name used by NumPy and the TensorFlow Python API.

c = matmul(a, b)

We expect to add a tensor contraction operation with the same semantics as tf.tensordot, and name it `dot`. Along with the new `dot`, we plan to introduce a bullet operator `•` to represent tensor dot, which is also easy to type (at least on a Mac, with Option+8). This would be a good starter project, if anyone is interested in implementing it.
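To make the intended semantics concrete, here is a rough sketch in plain Python (the eventual API would of course be Swift) of what `dot`/`•` would compute when contracting a single axis, restricted to 2-D inputs for brevity. The function name and shapes here are illustrative only:

```python
def dot(a, b):
    """Contract the last axis of `a` with the first axis of `b`.

    This mirrors tf.tensordot with one contracted axis; for 2-D
    inputs it reduces to ordinary matrix multiplication.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    assert len(a[0]) == inner, "contracted axes must have equal length"
    return [[sum(a[i][k] * b[k][j] for k in range(inner))
             for j in range(cols)]
            for i in range(rows)]

a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
print(dot(a, b))  # [[19, 22], [43, 50]]
```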

2. The high-level API prototype is removed. (commit)

Before our open-source release, we coded up a very basic prototype of high-level APIs, placed in HighLevel.swift. The intention was to explore the programming model and discover the challenges that the Graph Program Extraction approach brings to machine learning code, e.g. the lack of support for aggregate parameter updates. However, the APIs exposed by this file, e.g. FullyConnectedLayer and Optimizer, caused nothing but confusion for open-source testers, as they were never meant to be used by anyone. As a result, we completely removed these APIs from the Swift TensorFlow library.

A frequently asked question is whether high-level APIs will be part of the standard library in the compiler codebase. The answer to that is no, because high-level APIs will likely not require any compiler support. Instead, future high-level APIs in Swift will be a separate Swift Package under github.com/tensorflow.

The core team has no plans to design high-level APIs in the short term, because it is very important to get the compiler-enabled building blocks right, including

- core APIs (the TensorFlow module)
- automatic differentiation
- device availability diagnostics
- graph-level device placement
- the constant expression model (compiler-evaluable code)
- Python interoperability
- domain-specific Swift syntax and semantics

When it’s the right time to discuss high-level APIs, the Swift for TensorFlow team will work closely with the open source community, the broader TensorFlow team, and alpha users to come up with the best design.

-Richard

Stephan Hoyer

Jun 2, 2018, 9:49:26 PM
to Richard Wei, Swift for TensorFlow
On Sat, Jun 2, 2018 at 6:08 PM 'Richard Wei' via Swift for TensorFlow <sw...@tensorflow.org> wrote:
We expect to add a tensor contraction operator which has the same semantics as tf.tensordot, and name it “dot”. Along with the new `dot`, we plan to introduce a bullet operator `•` to represent tensor dot, which can also be easily typed (at least on a Mac, with option+8).

In NumPy, matmul (which corresponds to the Python infix operator @) is a much more recent addition than np.dot or np.tensordot -- one that was born of significant experience.

Along with broadcasting (and insertion of dummy dimensions), it's a much more flexible operation than tensordot, because it can also be used to implement batched matrix multiplication. Unless there's something very different about how people do these sorts of operations in ML code, I would reconsider changing the infix operator. It might be worth investigating the relative popularity of matmul and tensordot with the Python API, for example.
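As an illustration of that flexibility, here is a toy plain-Python sketch (not NumPy itself) of the batching behavior that matmul-style semantics give you on rank-3 inputs, which a single tensordot call does not express:

```python
def batched_matmul(a, b):
    """Multiply matching pairs of matrices along a leading batch axis,
    mimicking the batching behavior of np.matmul on 3-D inputs."""
    result = []
    for ai, bi in zip(a, b):  # one matrix product per batch element
        rows, inner, cols = len(ai), len(bi), len(bi[0])
        result.append([[sum(ai[i][k] * bi[k][j] for k in range(inner))
                        for j in range(cols)]
                       for i in range(rows)])
    return result

a = [[[1, 0], [0, 1]],   # batch of two 2x2 matrices
     [[2, 0], [0, 2]]]
b = [[[1, 2], [3, 4]],
     [[1, 2], [3, 4]]]
print(batched_matmul(a, b))  # [[[1, 2], [3, 4]], [[2, 4], [6, 8]]]
```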

Richard Wei

Jun 2, 2018, 10:06:09 PM
to Stephan Hoyer, Swift for TensorFlow
Thanks for the insight! Yes, the tensor dot operator is definitely up for debate. Also, `@` isn’t an available operator token in Swift, though we can try to make it one by proposing a language change if there’s a compelling reason.

Renaming the old `Tensor.dot` method to the `matmul` function was considered a positive change because the underlying implementation is TF’s MatMul op, which is semantically different from tf.tensordot. We strive to take established API conventions into account while introducing Swifty APIs.

-Richard

Chris Lattner

Jun 2, 2018, 10:40:19 PM
to Richard Wei, Stephan Hoyer, Swift for TensorFlow

On Jun 2, 2018, at 7:06 PM, 'Richard Wei' via Swift for TensorFlow <sw...@tensorflow.org> wrote:

Thanks for the insight! Yes, the tensor dot operator is definitely up for debate. Also, `@` isn’t an available operator token in Swift, though we can try to make it one by proposing a language change if there’s a compelling reason.

I see no reason to use @ as the operator even if it were allowed in Swift. We should use something that is communicative; we don’t need compatibility with Python’s decision.

-Chris





Ray Fix

Jun 5, 2018, 11:25:44 PM
to Chris Lattner, 'Chris Lattner' via Swift for TensorFlow, Richard Wei, Stephan Hoyer
[Pardon my ignorance, in advance.]

Is this operation akin to %*% in R?  

R, of course, does all kinds of crazy broadcasting, even more so than NumPy, IIUC, because it turns out to be pretty convenient for interactive use. Brevity is sometimes prioritized over clarity, unlike Swift’s goals. With Swift, I imagine solving this with a lazy Broadcast container type. Something like:

c = a %*% b.broadcasted

Maybe “broadcast” is not the right term to use, though. Also, maybe %*% resembles line noise too much, but it does work in Swift. :]
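To sketch the idea in executable form, here is a toy Python version of an explicit, opt-in broadcast wrapper (the Swift spelling above is just pseudocode, and all names here are made up for illustration):

```python
class Broadcasted:
    """Marker wrapper: this operand has explicitly opted in to broadcasting."""
    def __init__(self, row):
        self.row = row

def add(a, b):
    """Elementwise addition that refuses to broadcast unless asked."""
    if isinstance(b, Broadcasted):
        # Broadcast the 1-D row across every row of the 2-D `a`.
        return [[x + y for x, y in zip(r, b.row)] for r in a]
    # Without the wrapper, shapes must match exactly.
    assert len(a) == len(b), "shapes must match exactly without Broadcasted"
    return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]

print(add([[1, 2], [3, 4]], Broadcasted([10, 20])))  # [[11, 22], [13, 24]]
```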

Ray

Richard Wei

Jun 5, 2018, 11:38:22 PM
to Ray Fix, Chris Lattner, 'Chris Lattner' via Swift for TensorFlow, Stephan Hoyer
Currently we want to maintain the semantics of common TensorFlow ops, such as arithmetic ops. All of them broadcast implicitly, and there’s no way to turn that off unless we reimplement the kernels for these ops. Explicit broadcasting does have benefits and can make code less error-prone, but re-evaluating existing TensorFlow ops isn’t a high priority today, and a certain level of compatibility with existing TensorFlow semantics is important for the initial adoption of Swift.
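For contrast, a toy plain-Python sketch of the implicit behavior being described, where a lower-rank operand broadcasts silently with no opt-in, as TensorFlow's arithmetic ops do today (names are illustrative, not real API):

```python
def add_implicit(a, b):
    """Elementwise addition where a 1-D `b` silently broadcasts
    across the rows of a 2-D `a`, as TF's arithmetic ops do."""
    if b and not isinstance(b[0], list):  # rank-1 operand: broadcast it
        return [[x + y for x, y in zip(row, b)] for row in a]
    return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]

print(add_implicit([[1, 2], [3, 4]], [10, 20]))  # [[11, 22], [13, 24]]
```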

-Richard

Ray Fix

Jun 5, 2018, 11:43:23 PM
to Richard Wei, Chris Lattner, 'Chris Lattner' via Swift for TensorFlow, Stephan Hoyer
Got it.  Thanks for the explanation.

Ray