I think the comparison with Arrow is a little unfair. I agree the
Arrow libraries may have had breaking changes in every release.
However, the columnar format has not had a single breaking change
since the 1.0.0 release. I would say the fact that a file written by
pyarrow 0.17.0 is still readable two years later is pretty important
for its success. Parquet, also, has had very few breaking changes
since 1.0.0. This is going to be especially important for Substrait.
Otherwise we are going to end up with a very real challenge of
aligning producers and consumers.
I don't really care about version numbers as much as I care about the
idea of backwards (and ideally forwards) compatibility. At some point
the project has to switch from its current, experimental state to a
state where we preserve backwards compatibility by making it harder to
make breaking changes. Let's consider a concrete example:
https://github.com/substrait-io/substrait/pull/169
This PR changes the "format" field from an enum to a oneof with a
variety of message types. There is a suggestion to reserve the
"format" field and use a new "file_format" field but that alone
wouldn't give backwards compatibility. To truly have backwards
compatibility we'd have to keep both fields in and have a default
"legacy" option in the newer field which, if set, would tell consumers
to fall back to the old behavior of checking the enum. This way a
message from an older producer would be handled just as correctly as a
message from a newer producer.
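
As a rough sketch of what that fallback could look like on the consumer
side (Python only for brevity; every name below is made up for
illustration and is not the actual Substrait message definition or
what the PR proposes):

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Union


class LegacyFormat(Enum):
    # Stand-in for the old enum-valued "format" field (values are hypothetical).
    UNSPECIFIED = 0
    PARQUET = 1


@dataclass
class Legacy:
    """Default option of the new field: 'go check the old enum instead'."""


@dataclass
class ParquetOptions:
    """Hypothetical message type carried by the new oneof."""


@dataclass
class OrcOptions:
    """Hypothetical message type carried by the new oneof."""


FileFormat = Union[Legacy, ParquetOptions, OrcOptions]


@dataclass
class FileEntry:
    # Old field, kept in place so messages from older producers still decode.
    format: LegacyFormat = LegacyFormat.UNSPECIFIED
    # New field; an older producer never sets it, so it defaults to Legacy.
    file_format: FileFormat = field(default_factory=Legacy)


def resolve_format(entry: FileEntry) -> str:
    """Return the format name a consumer should use for this entry."""
    if isinstance(entry.file_format, Legacy):
        # Older producer: fall back to the old behavior of checking the enum.
        if entry.format is LegacyFormat.PARQUET:
            return "parquet"
        raise ValueError("neither the old nor the new format field is set")
    if isinstance(entry.file_format, ParquetOptions):
        return "parquet"
    if isinstance(entry.file_format, OrcOptions):
        return "orc"
    raise ValueError("unknown file_format option")


# An older producer only fills in the enum; a newer one only the new field.
old_message = FileEntry(format=LegacyFormat.PARQUET)
new_message = FileEntry(file_format=OrcOptions())
```

Both old_message and new_message go through the same resolve_format
call, which is the extra branch every consumer would have to carry.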
However, this is a lot of extra work on the consumers and I don't
think we are all that close to the point where this kind of work is
justified. I do think we will eventually get to the point where this
sort of thing is important to do.
The "1.0.0" version number is a significant tool to communicate this
shift in backwards compatibility philosophy. But if we think it is
more of a burden for some reason then I don't really care if we
stabilize at 5.0.0 or 20.0.0 or 100.0.0 as long as we agree it will be
important to stabilize at some point.