Announcing substrait-explain: A New Text-Based Format for Substrait Plans

Wendell Smith

Jul 8, 2025, 4:16:29 PM
to substrait
Dear Substrait Community,

I'm excited to announce substrait-explain, a new project providing a human-readable, SQL-EXPLAIN-like format for Substrait plans. This approach has been particularly useful internally at Datadog for daily development and debugging, especially for logging production queries and testing.

Here’s an example of the format:

=== Extensions
URIs:
  @  1: https://github.com/substrait-io/substrait/blob/main/extensions/functions_comparison.yaml
  @  2: https://github.com/substrait-io/substrait/blob/main/extensions/functions_arithmetic.yaml
  @  3: https://github.com/substrait-io/substrait/blob/main/extensions/functions_aggregate.yaml

Functions:
  # 10 @  1: eq
  # 11 @  1: gt
  # 12 @  2: multiply
  # 13 @  3: sum

=== Plan
Root[customer_revenue]
  Aggregate[$0, $1 => $0, $1, sum($3)]
    Filter[gt($3, 100) => $0, $1, $2, $3]
      Project[$0, $1, $2, multiply($4, $5)]
        Join[&Inner, eq($0, $3) => $0, $1, $2, $3, $4, $5]
          Read[users => id:i64, name:string, region:string]
          Read[orders => user_id:i64, quantity:i32, price:i64]

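To make the anchor notation in the Extensions section concrete, here is a minimal Python sketch that reads it as "`@ N`" declaring a URI anchor and "`# M @ N`" declaring function anchor M within URI anchor N. That reading is inferred from the example above. The helper names `parse_extensions` and `resolve_function` are hypothetical and are not part of substrait-explain's API.

```python
import re

# The extensions section, copied from the example above.
EXTENSIONS_TEXT = """\
=== Extensions
URIs:
  @  1: https://github.com/substrait-io/substrait/blob/main/extensions/functions_comparison.yaml
  @  2: https://github.com/substrait-io/substrait/blob/main/extensions/functions_arithmetic.yaml
  @  3: https://github.com/substrait-io/substrait/blob/main/extensions/functions_aggregate.yaml

Functions:
  # 10 @  1: eq
  # 11 @  1: gt
  # 12 @  2: multiply
  # 13 @  3: sum
"""

# Assumed line shapes, inferred from the example only.
URI_LINE = re.compile(r"^\s*@\s*(\d+):\s*(\S+)\s*$")
FUNCTION_LINE = re.compile(r"^\s*#\s*(\d+)\s*@\s*(\d+):\s*(\S+)\s*$")


def parse_extensions(text):
    """Return (uris, functions).

    uris maps URI anchor -> URI.
    functions maps function anchor -> (function name, URI anchor).
    """
    uris, functions = {}, {}
    for line in text.splitlines():
        m = FUNCTION_LINE.match(line)
        if m:
            functions[int(m.group(1))] = (m.group(3), int(m.group(2)))
            continue
        m = URI_LINE.match(line)
        if m:
            uris[int(m.group(1))] = m.group(2)
    return uris, functions


def resolve_function(anchor, uris, functions):
    """Look up a function anchor and return (name, defining URI)."""
    name, uri_anchor = functions[anchor]
    return name, uris[uri_anchor]


uris, functions = parse_extensions(EXTENSIONS_TEXT)
print(resolve_function(12, uris, functions))
```

With the example input, anchor 12 resolves to `multiply` from the `functions_arithmetic.yaml` URI.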

While the Substrait protobuf format is robust for programmatic use, its JSON form can be challenging for humans to digest. substrait-explain aims to provide that clear, "EXPLAIN"-like view for understanding plans at a glance. It's a two-way tool: capable of both outputting this text format from a Substrait plan and parsing the text back into a plan.
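As a rough illustration of the text-to-plan direction, the Python sketch below parses the indented Plan section into a tree. It is not substrait-explain's parser: it assumes two spaces of indentation per level and a `Name[arguments => outputs]` line shape, both inferred from the example, and the names `Node`, `parse_line`, and `parse_plan` are hypothetical. It keeps expressions as strings and does not build a Substrait protobuf.

```python
import re
from dataclasses import dataclass, field

# The plan section, copied from the example in this post.
PLAN_TEXT = """\
Root[customer_revenue]
  Aggregate[$0, $1 => $0, $1, sum($3)]
    Filter[gt($3, 100) => $0, $1, $2, $3]
      Project[$0, $1, $2, multiply($4, $5)]
        Join[&Inner, eq($0, $3) => $0, $1, $2, $3, $4, $5]
          Read[users => id:i64, name:string, region:string]
          Read[orders => user_id:i64, quantity:i32, price:i64]
"""

RELATION = re.compile(r"^(\w+)\[(.*)\]$")


@dataclass
class Node:
    name: str
    arguments: list
    outputs: list
    children: list = field(default_factory=list)


def split_top_level(text):
    """Split on commas that are not nested inside parentheses."""
    parts, depth, current = [], 0, []
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    tail = "".join(current).strip()
    if tail:
        parts.append(tail)
    return parts


def parse_line(line):
    """Parse one 'Name[args => outputs]' line into a childless Node."""
    m = RELATION.match(line.strip())
    if m is None:
        raise ValueError(f"not a relation line: {line!r}")
    name, body = m.groups()
    left, arrow, right = body.partition("=>")
    outputs = split_top_level(right) if arrow else []
    return Node(name, split_top_level(left), outputs)


def parse_plan(text, indent=2):
    """Build a tree from indentation; stack[i] is the latest node at depth i."""
    root, stack = None, []
    for line in text.splitlines():
        if not line.strip():
            continue
        depth = (len(line) - len(line.lstrip(" "))) // indent
        node = parse_line(line)
        if depth == 0:
            root = node
        else:
            stack[depth - 1].children.append(node)
        del stack[depth:]
        stack.append(node)
    return root


plan = parse_plan(PLAN_TEXT)
join = plan.children[0].children[0].children[0].children[0]
print(join.name, join.arguments, [c.arguments[0] for c in join.children])
```

The sketch splits on the first `=>` in a line, so it would need more care for literals that contain that sequence.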

This project is in active development, currently supporting basic relations with more coverage planned. I hope the community finds it useful! While we’re happy for this project to remain under the Datadog umbrella, we'd certainly be open to discussing adoption by substrait-io in the future if interest develops. Your feedback and contributions are very welcome, especially as its coverage and capabilities evolve – I’d love to hear your thoughts on how this might be more useful!

Best regards,

Wendell Smith
Datadog Substrait Team

Jacques Nadeau

Jul 8, 2025, 4:23:49 PM
to subs...@googlegroups.com
Super cool and nice to see. As a db person, I also appreciate the root to tree format.


Aldrin

Jul 8, 2025, 5:10:03 PM
to subs...@googlegroups.com
Hi Wendell!

Thanks for sharing!

I especially like how the Join and Project information is shared:
  • Join[&Inner, eq($0, $3) => $0, $1, $2, $3, $4, $5]

Assuming attributes from Read are collected in order, it feels fairly intuitive for me to read what the Join is doing and how the input attributes are propagated.
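To spell out that reading, here is a small Python sketch under the same assumption: the Join sees the fields of the first Read followed by the fields of the second, so `$0` to `$2` come from `users` and `$3` to `$5` from `orders`. The helper names `join_input_fields` and `describe` are made up for illustration.

```python
import re

# Field names taken from the two Read lines in the example plan.
USERS = ["users.id", "users.name", "users.region"]
ORDERS = ["orders.user_id", "orders.quantity", "orders.price"]


def join_input_fields(left, right):
    """Assumed ordering: left input fields first, then right input fields."""
    return list(left) + list(right)


def describe(expression, fields):
    """Replace each $N reference with the name of field N."""
    return re.sub(r"\$(\d+)", lambda m: fields[int(m.group(1))], expression)


fields = join_input_fields(USERS, ORDERS)
print(describe("eq($0, $3)", fields))        # eq(users.id, orders.user_id)
print(describe("multiply($4, $5)", fields))  # multiply(orders.quantity, orders.price)
```

Read this way, the join condition compares `users.id` with `orders.user_id`, and the Project above it multiplies quantity by price.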



Matthew White

Jul 15, 2025, 3:40:58 PM
to substrait
Hello, a text format for debugging is very useful, so I completely support this idea. It's good to have something that gives an overview of the plan.

Just to note, there is a `SubstraitStringify` in the examples in SubstraitJava; it's worth considering how the two are positioned.
I should add that I wrote that one, but I'm not precious about it :-) Maybe just run it and see if there's anything that can be used as inspiration.

Thanks, Matthew
