Apache Jelly grammar and parser


Alceu Rodrigues de Freitas Junior

Aug 24, 2022, 12:29:09 PM
to Jenkins Developers

Greetings,

My name is Alceu and I'm new to this list.

I came here hoping to get some pointers on writing a parser for Apache Jelly files.

I've been working on a CLI that parses the properties and Jelly files from the Jenkins project, in order to help the translation process from English to other languages.

The project is here: https://github.com/glasswalk3r/jenkins-translation-tool.

The CLI still uses regular expressions for the Jelly parsing, and I've been struggling to replace them with proper parsers.

I got partway there by introducing an XML parser and then extracting the Jelly strings.

The thing is, Jelly is complex enough that regular expressions are still not good enough for parsing it: I keep finding corner cases that are not covered by an already complex regular expression.

I decided that it is time to stop and try to build a proper grammar to parse Jelly.

Now the problem is that I am not acquainted with the Jelly project itself (https://commons.apache.org/proper/commons-jelly/).

If any of you could give me some pointers on where the Jelly grammar is defined, I could try to use the ANTLR project to generate a grammar that can be used with one of the several grammar distributions available for Perl.

Or, if you have a better suggestion than trying to use a grammar, I'm also open to it.

Thanks in advance,

Alceu

Basil Crow

Aug 24, 2022, 12:50:20 PM
to jenkin...@googlegroups.com
On Wed, Aug 24, 2022 at 9:29 AM Alceu Rodrigues de Freitas Junior
<alceu.fr...@gmail.com> wrote:
> My name is Alceu and I'm new to this list.

Welcome to the list! 👋

> Now the problem is that I am not acquainted with the Jelly project itself (https://commons.apache.org/proper/commons-jelly/)

I am not too acquainted with it either really. The version of Jelly
used in Jenkins is available at:

https://github.com/jenkinsci/jelly

Since Apache Commons Jelly is more or less an abandoned project, the
Jenkins version of Jelly has become the de facto repository of record
in recent years.

> If any of you could give me some pointers on where the Jelly grammar is defined, I could try to use the ANTLR project to generate a grammar that can be used with one of the several grammar distributions available for Perl.

I am not aware of a grammar. The parser is a relatively simple ~1,000
line Java class based on the SAX library:

https://github.com/jenkinsci/jelly/blob/fd230ceb0f98719de625d0bf8e239d0ec133ba9b/src/java/org/apache/commons/jelly/parser/XMLParser.java

Perhaps it might be feasible to invoke this parser directly from a
Java command-line tool or to port it to your language of choice.

Tim Van Holder

Aug 25, 2022, 3:28:45 PM
to jenkin...@googlegroups.com
Hi Alceu,

I would expect that in most cases, the jelly files used by Jenkins and its plugins would already be using localized strings (like "${%Hello World}", which will use the "Hello World" resource from the bundle matching the jelly file's basename).

I'm not sure why you would need any regex parsing on top of the XML processing either (or write a grammar for Jelly when it's really XML); for the standard set of jelly/stapler/... tags, the list of attributes that contain "real" text should be fairly well-known.
And nested text nodes, especially for non-jelly tags, could likely always be considered as localizable text.
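
A minimal sketch of that XML-only approach, written in Python for brevity rather than in the tool's Perl. The sample document, the `I18N` pattern, and the `extract_keys` helper are all made up for illustration; the pattern only covers the simple "${%Key}" and "${%Key(args)}" forms.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical pattern: "${%" starts a key, which runs up to "(" or "}".
I18N = re.compile(r"\$\{%([^}(]+)")

# Made-up sample, not taken from Jenkins.
SAMPLE = """\
<j:jelly xmlns:j="jelly:core" xmlns:f="/lib/form">
  <f:entry title="${%Hello World}">
    <p>${%Some description}</p>
    <p>Plain text that is not yet localized</p>
  </f:entry>
</j:jelly>
"""

def extract_keys(jelly_source):
    """Return the sorted ${%...} keys found in attributes and text nodes."""
    keys = set()
    for element in ET.fromstring(jelly_source).iter():
        # Attribute values, the element's own text, and the text following it.
        candidates = list(element.attrib.values()) + [element.text, element.tail]
        for value in candidates:
            if value:
                keys.update(match.strip() for match in I18N.findall(value))
    return sorted(keys)

print(extract_keys(SAMPLE))  # ['Hello World', 'Some description']
```

The XML parser does the structural work here, so the regular expression only ever sees one attribute value or text node at a time.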

Note: perhaps it would be useful for the IDEA plugin to have an inspection that will suggest using "${%Hello World}" instead of just "Hello World" for well-known attributes, like title on f:entry (probably the most common bit of localizable text in a jelly file).
Perhaps even a CodeQL-based thing on GitHub, to identify possible localization issues.


--
You received this message because you are subscribed to the Google Groups "Jenkins Developers" group.
To unsubscribe from this group and stop receiving emails from it, send an email to jenkinsci-de...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/jenkinsci-dev/21f6b5d9-60d7-4c8b-a558-4b0136a2f78an%40googlegroups.com.

Daniel Beck

Aug 26, 2022, 7:28:58 AM
to jenkin...@googlegroups.com
On Wed, Aug 24, 2022 at 6:29 PM Alceu Rodrigues de Freitas Junior <alceu.fr...@gmail.com> wrote:
> I've been working on a CLI that parses the properties and Jelly files from the Jenkins project, in order to help the translation process from English to other languages.
>
> The project is here: https://github.com/glasswalk3r/jenkins-translation-tool.
>
> The CLI still uses regular expressions for the Jelly parsing, and I've been struggling to replace them with proper parsers.
>
> I got partway there by introducing an XML parser and then extracting the Jelly strings.
>
> The thing is, Jelly is complex enough that regular expressions are still not good enough for parsing it: I keep finding corner cases that are not covered by an already complex regular expression.
>
> I decided that it is time to stop and try to build a proper grammar to parse Jelly.

Out of curiosity, what are some of the corner cases that cause problems? How common are they?


Daniel Beck

Aug 26, 2022, 7:30:16 AM
to jenkin...@googlegroups.com
On Thu, Aug 25, 2022 at 9:28 PM Tim Van Holder <tim.va...@gmail.com> wrote:
> I would expect that in most cases, the jelly files used by Jenkins and its plugins would already be using localized strings (like "${%Hello World}", which will use the "Hello World" resource from the bundle matching the jelly file's basename).

The linked tool is a fork of https://github.com/jenkinsci/jenkins/blob/master/translation-tool.pl which finds those %whatever strings and puts them into .properties files so the people translating don't need to hunt for them in Jelly files.
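
As a rough illustration of that last step (not the tool's actual code, which is Perl), here is a Python sketch that turns a set of extracted keys into .properties lines. The helper names are hypothetical, and leaving the values empty for translators to fill in is an assumption. The escaping follows the Java properties format, where a space, "=", ":" or backslash inside a key must be preceded by a backslash.

```python
import re

def escape_key(key):
    """Backslash-escape characters that would otherwise end a .properties key."""
    return re.sub(r"([ =:\\])", r"\\\1", key)

def to_properties(keys):
    # Assumption: values are left empty so that translators can fill them in.
    return "\n".join(f"{escape_key(key)}=" for key in sorted(keys))

print(to_properties({"Manage Jenkins", "are.you.sure"}))
# Manage\ Jenkins=
# are.you.sure=
```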

Alceu Rodrigues de Freitas Junior

Aug 26, 2022, 10:23:39 AM
to jenkin...@googlegroups.com
Hello Basil,

Thanks for the references, I'll be checking them ASAP.

The Jelly part of Apache Commons does indeed seem to be an abandoned project (I tried their mailing list and got no answers).

It would be nice if I didn't need to duplicate the logic in Perl, but I'm not sure how feasible that is. Looking at https://metacpan.org/search?size=20&q=Java there are some promising results like https://metacpan.org/dist/Inline-Java/view/lib/Inline/Java.pod, but I have never used it before, so I don't know how well it works.

Alceu Rodrigues de Freitas Junior

Aug 26, 2022, 10:42:21 AM
to jenkin...@googlegroups.com
Thanks Tim!

That is what I was expecting, but I got some cases that don't follow the format you just described.

Alceu Rodrigues de Freitas Junior

Aug 26, 2022, 11:04:23 AM
to jenkin...@googlegroups.com
I think the best way to describe it is by showing those cases. That should illustrate my answer to Tim.


This is an ad hoc set of files: first I just copied the files that the parser missed (comparing the results of this branch's parser with those of the original parser from the main branch), then I started copying entire lines from other Jelly files. You can expect the XML to be OK, but the messages might be out of context (and I guess that doesn't matter).

Here is the easiest way to find out where I stopped in the validation process:

$ git branch
  main
* refactor/jelly_parser
$ prove -lvm t/load_jelly.t
t/load_jelly.t ..
# Using t/samples/message.jelly
ok 1 - result is a hash reference
ok 2 - result has the expected content
# Using sample t/samples/config.jelly
ok 3 - result is a hash reference
ok 4 - result has the expected content
# Using sample t/samples/signup.jelly
ok 5 - result is a hash reference
ok 6 - result has the expected content
# Using sample t/samples/oops.jelly
ok 7 - result is a hash reference
ok 8 - result has the expected content
# Using sample t/samples/manage.jelly
ok 9 - result is a hash reference
not ok 10 - result has the expected content

#   Failed test 'result has the expected content'
#   at t/load_jelly.t line 93.
#     Structures begin differing at:
#          $got->{updateAvailable} = Does not exist
#     $expected->{updateAvailable} = '1'
# {
#   'Manage\\ Jenkins' => 1,
#   'are.you.sure' => 1,
#   'updateAvailable(m.getUpdateCount())\'\\ \\:\\ \'' => 1,
#   'updatesAvailable(m.getUpdateCount())\'' => 1
# }
1..10
# Looks like you failed 1 test of 10.
Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/10 subtests

Test Summary Report
-------------------
t/load_jelly.t (Wstat: 256 (exited 1) Tests: 10 Failed: 1)
  Failed test:  10
  Non-zero exit status: 1
Files=1, Tests=10,  0 wallclock secs ( 0.02 usr  0.00 sys +  0.09 cusr  0.00 csys =  0.11 CPU)
Result: FAIL
$

This is the line causing me trouble:

<div tooltip="${m.getUpdateCount() == 1 ? '%updateAvailable(m.getUpdateCount())' : '%updatesAvailable(m.getUpdateCount())'}" class="jenkins-section__item__icon__badge">

Supposedly the XML parser is getting the tooltip text as expected, but I would need to make an additional effort to parse the rest properly.

Maybe this is a section that I shouldn't bother with at all. Maybe all the available samples need a review.

The thing is, even with this current failure, the experimental parser is already getting Jelly translation "segments" that are not identified by the original parser (from the main branch).
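
To make the tooltip case concrete, here is a small Python sketch (Python only for brevity; the tool itself is Perl) of the two-step idea: let the XML parser hand back the attribute value, then look for "%" keys that start either right after "${" or right after an opening quote inside the expression. The `KEY` pattern and the `keys_in_attribute` helper are hypothetical, and this is not a real parser for the expression language: it would break on keys that themselves contain quotes or parentheses. A closing </div> is added to the line so that it parses on its own.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical pattern: a "%" directly after "${" or after a quote starts a key;
# the key ends at "(", "}", or a quote.
KEY = re.compile(r"""(?:\$\{|['"])%([^(}'"]+)""")

LINE = """<div tooltip="${m.getUpdateCount() == 1 ? '%updateAvailable(m.getUpdateCount())' : '%updatesAvailable(m.getUpdateCount())'}" class="jenkins-section__item__icon__badge"></div>"""

def keys_in_attribute(xml_line, attribute):
    """Return the %keys found in one attribute of a single-element XML snippet."""
    value = ET.fromstring(xml_line).get(attribute, "")
    return [key.strip() for key in KEY.findall(value)]

print(keys_in_attribute(LINE, "tooltip"))
# ['updateAvailable', 'updatesAvailable']
```

The arguments in parentheses are dropped, which matches the key the failing test expects ('updateAvailable') rather than the longer strings the current parser returns.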


Daniel Beck

Aug 29, 2022, 4:31:05 AM
to jenkin...@googlegroups.com
On Fri, Aug 26, 2022 at 5:04 PM Alceu Rodrigues de Freitas Junior <alceu.fr...@gmail.com> wrote:

> Supposedly the XML parser is getting the tooltip text as expected, but I would need to make an additional effort to parse the rest properly.

Denys Digtiar

Aug 30, 2022, 12:27:06 AM
to Jenkins Developers
> Note: perhaps it would be useful for the IDEA plugin to have an inspection that will suggest using "${%Hello World}" instead of just "Hello World" for well-known attributes, like title on f:entry (probably the most common bit of localizable text in a jelly file).

DuMaM

Aug 30, 2022, 9:45:44 PM
to Jenkins Developers