Command-line SPARQL formatter?

Boris Pelakh

Jan 22, 2024, 12:50:47 PM
to rdflib-dev
My group has quite a few SPARQL queries in our git repo, and I would like to add a pre-commit formatter to standardize appearance and minimize format-related diff noise. Any suggestions? (I would also love to have one in VS Code, but that's a secondary concern.)
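
For the pre-commit side, here is a minimal sketch of a hook script, assuming the usual pre-commit behaviour of passing the staged file names as arguments. `format_sparql` is a hypothetical placeholder that only trims trailing whitespace and normalises the final newline; it stands in for whichever real SPARQL formatter ends up being used.

```python
from pathlib import Path


def format_sparql(text: str) -> str:
    """Hypothetical placeholder formatter, not a real SPARQL formatter.

    It strips trailing whitespace and ensures a single final newline.
    Note that it would also touch whitespace inside multi-line string
    literals, which a grammar-aware formatter would leave alone.
    """
    lines = [line.rstrip() for line in text.splitlines()]
    # Drop blank lines at the end of the file.
    while lines and not lines[-1]:
        lines.pop()
    return "\n".join(lines) + "\n" if lines else ""


def main(paths: list[str]) -> int:
    """Rewrite each file in place; return 1 if any file changed, else 0."""
    changed = False
    for name in paths:
        path = Path(name)
        original = path.read_text(encoding="utf-8")
        formatted = format_sparql(original)
        if formatted != original:
            path.write_text(formatted, encoding="utf-8")
            print(f"reformatted {name}")
            changed = True
    return 1 if changed else 0


# As a script, the entry point would be: sys.exit(main(sys.argv[1:]))
```

Returning a non-zero status when a file was rewritten makes the commit stop, so the reformatted files can be reviewed and staged again.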

Donny Winston

Jan 22, 2024, 3:49:37 PM
to JB
Ooh, I am interested as well. I use edmcouncil/rdf-toolkit (https://github.com/edmcouncil/rdf-toolkit) to accomplish this for Turtle (.ttl) files, but I have nothing for SPARQL files.

--
---
You received this message because you are subscribed to the Google Groups "rdflib-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to rdflib-dev+...@googlegroups.com.

--
Donny Winston, PhD (he/him/his)
Polyneme LLC
New York, NY

Nicholas Car

Jan 22, 2024, 4:46:07 PM
to rdflib-dev
We have just written a stand-alone Python SPARQL parser and serialiser that will probably do what you want:


We will contribute this to the RDFlib set of tools shortly.

It parses any SPARQL query according to the SPARQL grammar and then uses a serialiser to write it back out as a SPARQL string. This could be made to produce canonical serialisations, which would then standardise all your files.

Nick

Wes Turner

Jan 22, 2024, 4:56:04 PM
to rdfli...@googlegroups.com

Boris Pelakh

Jan 23, 2024, 11:53:38 AM
to rdfli...@googlegroups.com
I like the idea of utilizing the SPARQL parser/serializer, but I have a couple of questions about features:

1. Would it be able to preserve inline comments? The current Lark grammar discards comments, but keeping them is a requirement.
2. Would it be able to remove unused prefix declarations? Obviously, that's a nice-to-have.
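
To make the comment requirement checkable, here is a rough sketch that lists the comments in a query so they can be compared before and after formatting. It is a regex-based scan written for illustration, not the parser discussed in this thread; `find_comments` and the simplified IRI and string patterns are assumptions and do not cover every case in the SPARQL grammar.

```python
import re

# Leftmost-match alternation: IRIs and strings are consumed first, so a '#'
# inside them (e.g. <http://example.org/ns#>) is not mistaken for a comment.
TOKEN = re.compile("|".join([
    r"<[^<>\s]*>",                # IRI reference (simplified)
    r'"""[\s\S]*?"""',            # long string, double quotes
    r"'''[\s\S]*?'''",            # long string, single quotes
    r'"(?:[^"\\\n]|\\.)*"',       # short string, double quotes
    r"'(?:[^'\\\n]|\\.)*'",       # short string, single quotes
    r"(?P<comment>#[^\n]*)",      # comment running to the end of the line
]))


def find_comments(query: str) -> list[tuple[int, int, str]]:
    """Return (line, column, text) for each comment; both are 1-based."""
    found = []
    for match in TOKEN.finditer(query):
        if match.group("comment") is None:
            continue  # an IRI or string, not a comment
        start = match.start()
        line = query.count("\n", 0, start) + 1
        column = start - query.rfind("\n", 0, start)
        found.append((line, column, match.group("comment")))
    return found
```

A formatter test could then assert that the comment texts found in the input and in the formatted output are the same, in the same order.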

Nicholas Car

Jan 23, 2024, 3:54:13 PM
to rdfli...@googlegroups.com
Good questions there, Boris, that perhaps only Edmond can answer (he should get this and can do so!).

I imagine that both those things, if not already handled, could be added to the code he's written since it parses the query and then uses custom code to serialise. So the prefix removal step would be a straightforward addition in query content analysis before the serialisation. Surely comments just need new elements added to the grammar?
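
As a rough illustration of the prefix-removal idea, here is a sketch that works on the query text with regular expressions instead of the parse tree. It is a heuristic, not the code referred to above: `remove_unused_prefixes` is a hypothetical name, it only recognises declarations that sit alone on a line, and its IRI and string patterns are simplified.

```python
import re

# A prologue line such as: PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX_LINE = re.compile(
    r"^\s*PREFIX\s+([A-Za-z][\w.\-]*)?:\s*<[^<>\s]*>\s*$", re.IGNORECASE
)

# Spans where a "prefix:" look-alike must not count as a use.
SKIP = re.compile("|".join([
    r"<[^<>\s]*>",                # IRI reference (simplified)
    r'"""[\s\S]*?"""',            # long string, double quotes
    r"'''[\s\S]*?'''",            # long string, single quotes
    r'"(?:[^"\\\n]|\\.)*"',       # short string, double quotes
    r"'(?:[^'\\\n]|\\.)*'",       # short string, single quotes
    r"#[^\n]*",                   # comment running to the end of the line
]))


def remove_unused_prefixes(query: str) -> str:
    """Drop PREFIX lines whose label is never used in the rest of the query."""
    lines = query.splitlines()
    declared = {}  # line index -> prefix label ("" for the default prefix)
    for i, line in enumerate(lines):
        match = PREFIX_LINE.match(line)
        if match:
            declared[i] = match.group(1) or ""

    # The query without its declarations, with IRIs, strings and comments blanked.
    body = "\n".join(line for i, line in enumerate(lines) if i not in declared)
    body = SKIP.sub(" ", body)

    def is_used(label: str) -> bool:
        # "label:" must not follow a name character, so "xfoaf:" and "_:b0" do not count.
        return re.search(r"(?<![\w.\-])" + re.escape(label) + ":", body) is not None

    kept = [
        line for i, line in enumerate(lines)
        if i not in declared or is_used(declared[i])
    ]
    return "\n".join(kept) + ("\n" if query.endswith("\n") else "")
```

Comments are left in place here, and a prefix that appears only inside a comment or a string literal is treated as unused. A version built on the parsed query would not need these text-level workarounds.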

Nick

Edmond Chuc

Jan 25, 2024, 11:29:47 PM
to rdflib-dev
Hi Boris,

Yes, we can extend the grammar to preserve comments, as a separate parser from the canonical one. Removing unused prefixes can be implemented too.

I've added both as issues on the GitHub repo https://github.com/Kurrawong/sparql/issues. Feel free to add any other use cases there that you think are useful.

In a couple of weeks, we will put a web UI for this online so users can test and play around with the SPARQL formatter and validator. We will also add a CLI to the tool.

Cheers,

Edmond

Etienne Posthumus

Jan 29, 2024, 12:05:44 PM
to rdfli...@googlegroups.com
Dear Nicholas and Edmond,

Thanks for sharing your code, this is fab.
I have been using https://tree-sitter.github.io/tree-sitter/ for my SPARQL parsing (and rewriting) needs until now, but that needs some extra work when packaging, since it is not pure Python.

Really happy to see your parser (and serializer) - the code is easier for me to understand and re-use than the one currently in rdflib. Will see if I can use yours in the future for my query-rewriting needs.

regards

Etienne

Edmond Chuc

Jan 30, 2024, 1:43:02 AM
to rdfli...@googlegroups.com
Hi Etienne,

Very happy to hear that it's useful.

We are also using it for query rewriting as our main use case. Will be interested to hear what kind of query rewriting you're doing. Are you mainly performing rewrites for query optimisation?

Cheers,

Edmond

Edmond Chuc
Knowledge Graph Senior Developer
KurrawongAI
Email: edm...@kurrawong.ai
Website: https://kurrawong.ai


Etienne Posthumus

Mar 1, 2024, 6:00:19 AM
to rdfli...@googlegroups.com
Hi Edmond,

Some background info: I am doing query rewriting to add easy fulltext (and other) searches, where the underlying triplestore either does not have it supported, or where it might be too cumbersome to configure/install. See: https://github.com/ISE-FIZKarlsruhe/fizzysearch

This works well, but you need to package up some C libraries and configure a separate parser at usage/deployment time. So I might consider using your work instead, as it is more self-contained.

But, all that being said, the tree-sitter DSL is incredibly useful, see: https://github.com/ISE-FIZKarlsruhe/fizzysearch/blob/main/fizzysearch.py#L26

regards

Etienne