New place for Substrait MLIR dialect

Ingo Müller

Aug 23, 2024, 4:08:44 AM
to substrait
Dear Substrait community,

It was great to get the chance to share my work and get your input on the Substrait MLIR dialect in the community meeting last month!

I would now like to follow up on one thread of that discussion, namely finding the next home for the current prototype. Its current location isn't very meaningful; it is the result of historical, pragmatic decisions. If it could live in a more Substrait-related place, I think that would make its purpose clearer and give it more visibility.

I have briefly chatted with Weston. Here are the possibilities I took from that conversation:
  • It could live in a new repository inside the substrait-io GH organization, say, "substrait-mlir." However, since I am not a Substrait committer, I'd need all PRs to be reviewed, which could stall development. (It'd be great to get to that eventually, though!)
  • It could live in a to-be-created GH organization, say, "substrait-contrib," which would host third-party projects related to Substrait that don't have the required level of maturity/support from committers to live in the main organization. (As an example, datafusion has such an organization.) Depending on the nature and development of the projects here, they could eventually be promoted.
Both options would achieve my current goal. I would be interested to hear what people say.

All the best,
Ingo

Jacques Nadeau

Aug 26, 2024, 12:57:27 AM
to Substrait
I think both ideas make sense. We've been doing mostly the former thus far but I think the contrib pattern has merits. 

What kind of use have you seen so far wrt mlir/substrait integration? 

Ingo Müller

Aug 26, 2024, 3:29:44 AM
to subs...@googlegroups.com
Cool, thanks for the input!

No usage yet. The prototype doesn't have very broad coverage yet, but I think I'll get to supporting the first TPC-H query soon and then work towards full TPC-H coverage from there.

One goal of moving to a more official place is to make (experimental) usage and external contributions easier. There are at least two research prototypes in the DB space that already use MLIR (LingoDB and Daphne), both of which are candidates for early adoption. I also have other connections in academia to people who have shown interest.

All the best,
Ingo

Carlo Aldo Curino

Aug 26, 2024, 3:30:41 PM
to subs...@googlegroups.com
+1 on the “interest” side from me/MS. We are using Substrait internally, I see value in what is proposed here, and I would like to see it grow into part of the infra/integrations available for Substrait.

Thanks,
Carlo

Weston Pace

Aug 26, 2024, 5:28:53 PM
to subs...@googlegroups.com
I think it's good to have a place where these projects can live.  First, it helps to avoid future governance issues.  With both approaches the repo is clearly "owned" by Substrait, whereas the owner of a personal repo could yank the code / delete the repo / etc. at some point.  It also helps build collaboration on new projects, potentially attracting and generating new committers.  It helps with the "discoverability problem", e.g. users that are interested in Substrait can quickly find related projects.  Finally, it helps demonstrate activity and use of a project.  I'm not aware of any downsides but I'll ask around to see if others have had problems with this approach.

I'm fine with both approaches but, if we adopt the first approach (place it in substrait-io), then I think we need an update to the governance docs, which currently state:

> The Substrait project consists of the code and repositories that reside in the substrait-io GitHub organization
> [...]
> Non-breaking extension additions & non-format code modifications [require 1 committer (other than the proposer)]

These requirements make it difficult for a contrib project (which doesn't necessarily have a committer) to move forward.  So either we go with the second approach or we update the governance docs to specify special requirements for contrib projects.

With that in mind I lean somewhat towards the second approach (separate org).

Ingo Müller

Sep 13, 2024, 7:40:15 AM
to subs...@googlegroups.com
Hi everyone,

There was some discussion on the topic in the community meeting of August 28, which I, unfortunately, missed. The notes seem to suggest that repos in the main GitHub org with a `-contrib` suffix or similar would have the main benefit of better discoverability and that the guidelines could be changed to relax the requirements for these projects. Is that the conclusion of the discussion?

If so, what are the next steps? If people agree, I could take a first stab at formulating a proposal just to have something concrete to talk about. What do people think?

All the best,
Ingo

Weston Pace

Sep 13, 2024, 9:04:39 AM
to subs...@googlegroups.com
Yes, that matches my recollection.  Please feel free to go ahead and create some kind of proposal.  Just a small paragraph added to `site/docs/governance.md` is probably fine.  Something like "repositories ending in -contrib are community projects related to Substrait but are not part of official Substrait releases and not subject to the same level of scrutiny".

Ingo Müller

Sep 13, 2024, 9:33:47 AM
to subs...@googlegroups.com
Sounds good! Here is a first draft: https://github.com/substrait-io/substrait/pull/707. See the rationale in the commit message/description of the PR. Let me know what you think.

Cheers,
Ingo

Ingo Müller

Sep 26, 2024, 3:20:22 AM
to subs...@googlegroups.com
Hey everybody,

We discussed the latest revision of the changes in #707 in the community meeting yesterday (notes). There seems to have been consensus that the change fits the current needs of the community and that the current revision of the PR addresses all concerns that were previously raised.

I thus offered to call for an official vote by the SMC, which is required for the changes to be adopted, and which I hereby do :)

Thanks again everybody for the support and suggestions!

All the best,
Ingo