On Thu, Jul 11, 2013 at 3:38 AM, Nils Barth <nba...@chromium.org> wrote:
> Hi Luke,
> Thanks so much for the info!
no problem - sorry for taking a while to get back to you.
> I don't think we can effectively reuse your code
heck it's not even mine! i just took the best-of-breed proven code
from two disparate projects and spliced them together :)
>(or even could have had we
> known about it earlier), and we in fact already have done most of the work,
> but your description of what you're doing is very useful.
> (Perhaps WebKit could use your parser or code generator approach for all
> bindings?)
yehhh, about that: as many people will know already, the efforts in (i
think it was) 2009 to add gobject bindings using the existing
perl-based code generator illustrated very graphically why that
code generator needs to have its arms and legs ripped off, flung to
the furthest corners of the earth, and the rest buried deep at the
bottom of the mariana trench.
i don't know if you know the background, but the discussion got to
*three hundred and fifty* separate messages on the one bug report, and
at one time i had over 650 simultaneous vi sessions open - so many
were open in one xterm that i had to do e.g. "jobs | grep Node" in
order to find them.
the complexity was so overwhelming that the webkit developers simply
couldn't cope: i couldn't explain to them what the decisions were, and
i certainly didn't have time to go back and "redo" things step-by-step
in a way that would let them "follow along", so to speak, as the IDL
files are, as you've no doubt already discovered, horribly, horribly
inter-linked.
the bottom line is: they blamed me for the mess, and rather than fix
the problems, one of the developers began what can only be described
as a very carefully arranged vendetta which gave him perfect
justification to ban me from webkit's mailing list and bugtracker.
so, to answer your question above: they could... but they're so
embarrassed by their behaviour that the chances are quite remote.
i've done the best i can by assigning the copyright of the code to
the FSF, but they'll have to work things out for themselves.
eric seidel, whom i respect enormously, unfortunately got caught up
in the crossfire.
but... anyway, an interesting lesson that i'm glad to see that you're
avoiding by doing a decent redesign.
with that in mind - one thing to consider: given the background and
the stress caused by the existing perl-based design, do you *really*
want to keep it around, even accidentally, by copying it as part of a
straight conversion path?
> (Outline of what we're doing below; for updates follow:
> Issue 239771: Rewrite the IDL compiler in Python.)
>
> We're in fact taking a very similar approach: we're also using PLY (Python
> Lex-Yacc) for the parser/frontend,
ply's pretty awesome. but the funny thing is i didn't choose it: the
decision was made by the mozilla foundation when they did xpidl all
those years ago.
> and a template engine, in our case Jinja,
> for the code generator/backend.
if it's got the ability to do pre, middle and post function
generation, to handle return types, and to do pre- and post-
code-generation on argument types (as distinct from those same types
being used as return types), then you have everything you need here.
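to give a rough idea of the shape i mean, something along these lines
(purely illustrative - the template structure, the field names and the
generated C identifiers are all made up here, not blink's or gobject's
actual templates):

import jinja2

method_template = jinja2.Template("""\
{{ return_type }} {{ binding_name }}({{ parameters | join(', ') }})
{
{%- for arg in arguments %}
    {{ arg.pre }}
{%- endfor %}
    {{ call }};
{%- for arg in arguments %}
    {{ arg.post }}
{%- endfor %}
}
""")

print(method_template.render(
    return_type='void',
    binding_name='wrapped_node_append_child',
    parameters=['Wrapper* self', 'Wrapper* child'],
    arguments=[{
        'pre': 'Node* child_core = to_core(child);  /* conversion in */',
        'post': 'release(child_core);  /* cleanup out */',
    }],
    call='to_core(self)->appendChild(child_core)',
))

each argument contributes a "pre" fragment and a "post" fragment
around the actual call, which is exactly the hook you need.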
the pre- and post- code-generation associated with variable "types"
is rather important, as you need to do type-conversion prior to using
an argument and then clean up afterwards.
that's what the argtypes.py stuff from gobject's codegen is all
about. one class per "type" - int, float, object, date and so on -
and each "type" takes care of itself.
that then made the main "module" generation - where the functions,
property accessors and so on are generated - much easier to do.
makes for vastly simpler and more manageable code. i really didn't
have to do a vast amount of testing: i churned out the right pre/post
stuff for each argtype one by one, then ran it.
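concretely, the pattern was roughly this (hypothetical class and
method names, not the actual argtypes.py code, and the generated C
fragments are just for illustration):

class ArgType(object):
    """each IDL "type" knows how to convert itself in and clean up."""
    def pre(self, name):
        # code emitted before the call: convert / take a reference
        return ''
    def post(self, name):
        # code emitted after the call: release whatever pre() acquired
        return ''

class StringArg(ArgType):
    def pre(self, name):
        return 'char* %s_utf8 = g_strdup(%s);' % (name, name)
    def post(self, name):
        return 'g_free(%s_utf8);' % name

class ObjectArg(ArgType):
    def pre(self, name):
        return 'g_object_ref(%s);' % name
    def post(self, name):
        return 'g_object_unref(%s);' % name

# the module generator just looks the IDL type up and asks it for its
# pre/post fragments - no per-type special cases in the function or
# property generation code.
ARG_TYPES = {'DOMString': StringArg(), 'Node': ObjectArg()}

def pre_post_for(idl_type, arg_name):
    handler = ARG_TYPES[idl_type]
    return handler.pre(arg_name), handler.post(arg_name)

print(pre_post_for('Node', 'child'))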
testing-wise, this stuff's so critically dependent on getting it
right that bugs were, to be honest, blindingly obvious :) miss even a
single refcount on a pre-gen and you *really* know about it, instantly
:) miss a single decref on a post-gen and the sheer overwhelming
amount of memory leaked is just as obvious.
> These libraries make the rewrite clean and straightforward, but actual
> implementation takes lots of detailed work, similar to what you saw:
... detailed, and *very* intense concentration in order to maintain a
waaay above-average number of different tasks/issues/languages in your
brain at once...
... i sympathise :)
> * complexity of existing V8 C++ bindings
ahh, it's not so baaaad :) really!
> (main complexity, language-specific);
>
> * reading AST
... here i avoided that effort entirely, by making an
already-existing tried-and-tested and proven codebase comprehend
webkit IDL.
> (converting to Python objects as Intermediate Representation;
> straightforward but long, especially as we want to maintain exact
> compatibility with Perl during the transition);
[ yehh, i invite you to reconsider that decision in light of the
legacy that its design will carry forward, namely: the strain that the
design (or lack thereof) placed onto anyone who got involved with it. ]
again, here's where i avoided that effort entirely, by not only
dropping that legacy but also just... re-using that proven codebase.
i think the only changes i needed to make were to get it to comprehend
the webkit type-qualifiers (the ones in square brackets).
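"comprehending the square-bracket qualifiers" boils down to something
like this in a PLY grammar - a minimal self-contained sketch, nothing
like the real parser, but it shows the idea of an optional
[Attr, Attr=Value] list in front of a declaration:

import ply.lex as lex
import ply.yacc as yacc

tokens = ('IDENTIFIER', 'LBRACKET', 'RBRACKET', 'COMMA', 'EQUALS')
t_LBRACKET = r'\['
t_RBRACKET = r'\]'
t_COMMA = r','
t_EQUALS = r'='
t_IDENTIFIER = r'[A-Za-z_][A-Za-z0-9_]*'
t_ignore = ' \t\n'

def t_error(t):
    t.lexer.skip(1)

start = 'declaration'

def p_declaration(p):
    """declaration : extended_attrs IDENTIFIER"""
    p[0] = {'extAttrs': p[1], 'name': p[2]}

def p_extended_attrs(p):
    """extended_attrs : LBRACKET attr_list RBRACKET
                      | empty"""
    p[0] = p[2] if len(p) == 4 else {}

def p_attr_list(p):
    """attr_list : attr
                 | attr_list COMMA attr"""
    p[0] = p[1] if len(p) == 2 else dict(p[1], **p[3])

def p_attr(p):
    """attr : IDENTIFIER
            | IDENTIFIER EQUALS IDENTIFIER"""
    # bare qualifier -> True, Key=Value -> the value string
    p[0] = {p[1]: True} if len(p) == 2 else {p[1]: p[3]}

def p_empty(p):
    """empty :"""

def p_error(p):
    raise SyntaxError(p)

lexer = lex.lex()
parser = yacc.yacc(write_tables=False, debug=False)
print(parser.parse('[Conditional=GAMEPAD, NoInterfaceObject] Gamepad'))
# {'extAttrs': {'Conditional': 'GAMEPAD', 'NoInterfaceObject': True},
#  'name': 'Gamepad'}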
> * build changes
>
> (GYP rather than make);
oo fuuun :)
> * reusing existing Chromium code
>
> (see below).
>
>
> In terms of code reuse, rather than copy-pasting and modifying existing
> programs, we're using existing Chromium code as much as possible. We
> currently have 4 IDL parsers + code generators: Blink, Chromium bindings,
> Dart, and Pepper – actually 6 if you include old/new Blink and Pepper – and
> we'd like this to be 1, both to avoid code duplication and for
> maintainability.
mmmm.... my guess is that you'll end up getting bitten in the ass if
you try that :)
oh wait.... someone's already added a *completely separate* blink IDL
parser+generator (separate and distinct from the perl-based code
generator)?
if so, maybe people have learned from that absolute nightmare attempt
to add gobject bindings, after all.
but... yeah, if you're looking to do common code-generation with a
different back-end for each target language, and the codebase reflects
that by providing shared infrastructure plus language-specific
generators, then that's a great idea.
> There already is a PLY-based IDL parser in the Chromium trunk, namely the
> Pepper parser version 2 at src/tools/idl_parser, so for code reuse, the new
> Blink IDL parser derives from this base one. Ideally they'd in fact be
> merged, but we probably need a few compatibility tweaks. Thus we get the
> same result as using (py?)xpidl (a Python Lex-Yacc based parser), without
> the code duplication.
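the derivation works nicely with PLY, by the way, because grammar
rules are just p_* methods on a class, so a subclass can add or
override productions and inherit the rest. purely illustrative - these
are not the actual classes in src/tools/idl_parser, just the pattern:

import ply.lex as lex
import ply.yacc as yacc

class BaseLexer(object):
    tokens = ('IDENTIFIER', 'SEMI')
    t_IDENTIFIER = r'[A-Za-z_][A-Za-z0-9_]*'
    t_SEMI = r';'
    t_ignore = ' \t\n'
    def t_error(self, t):
        t.lexer.skip(1)
    def build(self):
        return lex.lex(object=self)

class BaseParser(object):
    tokens = BaseLexer.tokens
    start = 'Definition'
    def p_Definition(self, p):
        """Definition : IDENTIFIER SEMI"""
        p[0] = ('definition', p[1])
    def p_error(self, p):
        raise SyntaxError(p)
    def build(self):
        self.lexer = BaseLexer().build()
        return yacc.yacc(module=self, write_tables=False, debug=False)

class DerivedParser(BaseParser):
    # override one production; tokens, error handling and any other
    # p_* rules are inherited from the base parser untouched
    def p_Definition(self, p):
        """Definition : IDENTIFIER IDENTIFIER SEMI"""
        p[0] = ('definition', p[1], p[2])

p = DerivedParser()
parser = p.build()
print(parser.parse('interface Node;', lexer=p.lexer))
# ('definition', 'interface', 'Node')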
it's the AST that ideally needs to be the same in each. gosh, there
appear to have been quite a few disparate efforts and a lot of
non-communication going on. hmmm....
> As haraken noted, we've already made the build changes and the parser is
> complete locally, so we should land that in the next week, and then
> incrementally land the template-based code generator.
... bear in mind that the main reason i had such an enormous amount
of difficulty with the webkit team was that this stuff was flat-out
impossible to verify or even code up in anything remotely approaching
an "incremental" fashion. it was so inter-dependent that it was very
much an all-or-nothing deal, and verification very much had to be done
at the higher levels. i.e. because the code is so low-level and gets
called such an overwhelming number of times, any problems *instantly*
showed up, and so even writing unit tests turned out to be completely
unnecessary.
i mention this because if you are using the exact same ultra-strict
code-review procedures that the webkit developers deploy, you're going
to run into problems.
just... something to watch out for: because of the critical
inter-dependence of the IDL files, and because that inter-dependence
gets naturally reflected into the code, at some point you may end up
with a rather large patch that brings everything together and simply
can't be split into smaller chunks in any meaningful way, and you may
have to take a leap of faith at the top level that it just.... "works"
:)
> Thanks again for the note!
no problem sah.
/peace
l.