Design questions: how to extract the list of C++ methods implementing a given IDL

38 views
Skip to first unread message

luk...@chromium.org

unread,
Jan 9, 2017, 11:43:39 PM1/9/17
to blink-reviews-bindings
Hello,

The consensus for the Big Blink Rename is to keep camelCased method names for C++ methods that implement attributes and methods declared via IDL (https://crbug.com/673039#c37). To skip renaming such methods in the rewrite_to_chrome_style clang tool, we are considering generating a list of all such methods beforehand and then feeding that list to the clang tool. For example, the following C++ methods would be on the generated list, because they implement attributes and methods from Document.idl (this list is far from complete):
- blink::Document::origin (readonly attribute from Document.idl)
- blink::Document::createDocumentFragment (method from Document.idl)
- blink::Document::domain and blink::Document::setDomain (read-write attribute from Document.idl)
- blink::Location::href and blink::location::setHref (attribute from Document.idl that uses PutForwards)
- blink::ContainerNode::getElementsByTagName (method from Document.idl, implemented in base class of blink::Document; possibly the clang tool can figure out the inheritance when processing Document.cpp and apply that knowledge when later processing ContainerNode.cpp)

I tried looking at third_party/WebKit/Source/bindings/scripts earlier today and I see IdlReader in idl_reader.py and Visitor in idl_definitions.py. OTOH, I am not sure if there is a centralized code for mapping IDL AST nodes into C++ methods - the logic behind this kind of mapping seems to be scattered in various third_party/WebKit/Source/bindings/scripts/v8_*.py files.

Do you think generating the list of methods described above is a reasonable plan? What would be the best way to generate such list?

Do you think a standalone tool working with IdlReader and (IDL AST) Visitor classes can generate the method list described above? Or should we consider other approaches (maybe emitting the list from within v8_*.py scripts and/or Jinja templates? or maybe by scraping the generated files for impl->foo calls (possibly doable with a new/separate clang tool)?).

Ideally we want to generate the list above with 100% accuracy, but... do any of the answers above change if skipping renaming of a method needs to only be 95% accurate? (we can always evaluate the cost and trade-offs of fixing the remaining 5% methods manually).

Do you have any general advise to offer here? :-)


Thanks,

Lukasz

Kentaro Hara

unread,
Jan 10, 2017, 12:17:46 AM1/10/17
to Łukasz Anforowicz, platform-architecture-dev, Kenichi Ishibashi (via Google Docs), Yuki Shiino, blink-reviews-bindings
On Tue, Jan 10, 2017 at 1:43 PM, <luk...@chromium.org> wrote:
Hello,

The consensus for the Big Blink Rename is to keep camelCased method names for C++ methods that implement attributes and methods declared via IDL (https://crbug.com/673039#c37).  To skip renaming such methods in the rewrite_to_chrome_style clang tool, we are considering generating a list of all such methods beforehand and then feeding that list to the clang tool.  For example, the following C++ methods would be on the generated list, because they implement attributes and methods from Document.idl (this list is far from complete):
- blink::Document::origin (readonly attribute from Document.idl)
- blink::Document::createDocumentFragment (method from Document.idl)
- blink::Document::domain and blink::Document::setDomain (read-write attribute from Document.idl)
- blink::Location::href and blink::location::setHref (attribute from Document.idl that uses PutForwards)
- blink::ContainerNode::getElementsByTagName (method from Document.idl, implemented in base class of blink::Document;  possibly the clang tool can figure out the inheritance when processing Document.cpp and apply that knowledge when later processing ContainerNode.cpp)

I tried looking at third_party/WebKit/Source/bindings/scripts earlier today and I see IdlReader in idl_reader.py and Visitor in idl_definitions.py.  OTOH, I am not sure if there is a centralized code for mapping IDL AST nodes into C++ methods - the logic behind this kind of mapping seems to be scattered in various third_party/WebKit/Source/bindings/scripts/v8_*.py files.

If we go with the approach, does it mean that the clang plugin needs to invoke the IDL compiler and process all IDL files at each compilation? If the answer is yes, it would be too heavy.

One thing we could do is to make the IDL compiler generate a list of C++ methods in a dedicated file so that the file is updated only when there's any update in IDL files. Then the clang plugin can use the file to understand the list of C++ methods.
 

Do you think generating the list of methods described above is a reasonable plan?  What would be the best way to generate such list?

Do you think a standalone tool working with IdlReader and (IDL AST) Visitor classes can generate the method list described above?  Or should we consider other approaches (maybe emitting the list from within v8_*.py scripts and/or Jinja templates?  or maybe by scraping the generated files for impl->foo calls (possibly doable with a new/separate clang tool)?).

Ideally we want to generate the list above with 100% accuracy, but... do any of the answers above change if skipping renaming of a method needs to only be 95% accurate?  (we can always evaluate the cost and trade-offs of fixing the remaining 5%  methods manually).

Do you have any general advise to offer here? :-)


Thanks,

Lukasz




--
Kentaro Hara, Tokyo, Japan

luk...@chromium.org

unread,
Jan 10, 2017, 12:59:26 AM1/10/17
to blink-reviews-bindings, luk...@chromium.org, platform-arc...@chromium.org, ba...@chromium.org, yukis...@chromium.org
On Monday, January 9, 2017 at 9:17:46 PM UTC-8, Kentaro Hara wrote:
> If we go with the approach, does it mean that the clang plugin needs to invoke the IDL compiler and process all IDL files at each compilation? If the answer is yes, it would be too heavy.

I think the answer is "no". We would invoke the IDL parsing tool (*) only once, saving the generated list of C++ methods into a file. This file would then be repeatedly fed as an extra argument/input into the rewrite_to_chrome_style clang tool (as it processes all the C++ compilation units in Chromium).

(*) whatever "IDL parsing tool" is :-) - I am assuming this will be a new tool based on the code used by the IDL compiler, but I wanted to discuss if this is a good idea and/or if we should consider other approaches (for generating the list of C++ methods that should be blacklisted from the Great Blink Rename; or maybe even other blacklisting approaches that aren't necessarily based on such pregenerated list).


-Lukasz

Yuki Shiino

unread,
Jan 10, 2017, 1:03:02 AM1/10/17
to Łukasz Anforowicz, blink-reviews-bindings, platform-architecture-dev, Kenichi Ishibashi (via Google Docs), Yuki Shiino
Ignorant question: Why do we want to make difference between C++ methods for IDL definitions and C++ methods for other uses?  Why don't we simply change all C++ functions into FooBarBaz style, including IDL defined functions?  To me, it looks easier to change the IDL compiler to generate FooBarBaz style than generating the method list.

Cheers,
Yuki Shiino



--
You received this message because you are subscribed to the Google Groups "platform-architecture-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to platform-architecture-dev+unsub...@chromium.org.
To post to this group, send email to platform-architecture-dev@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/platform-architecture-dev/b842421e-789d-41a6-9de5-1031f0c04c27%40chromium.org.

Kentaro Hara

unread,
Jan 10, 2017, 1:08:19 AM1/10/17
to Łukasz Anforowicz, blink-reviews-bindings, platform-architecture-dev, Kenichi Ishibashi (via Google Docs), Yuki Shiino
On Tue, Jan 10, 2017 at 2:59 PM, <luk...@chromium.org> wrote:
On Monday, January 9, 2017 at 9:17:46 PM UTC-8, Kentaro Hara wrote:
> If we go with the approach, does it mean that the clang plugin needs to invoke the IDL compiler and process all IDL files at each compilation? If the answer is yes, it would be too heavy.

I think the answer is "no".  We would invoke the IDL parsing tool (*) only once, saving the generated list of C++ methods into a file.  This file would then be repeatedly fed as an extra argument/input into the rewrite_to_chrome_style clang tool (as it processes all the C++ compilation units in Chromium).

I think this will work.

 

(*) whatever "IDL parsing tool" is :-) - I am assuming this will be a new tool based on the code used by the IDL compiler, but I wanted to discuss if this is a good idea and/or if we should consider other approaches (for generating the list of C++ methods that should be blacklisted from the Great Blink Rename;  or maybe even other blacklisting approaches that aren't necessarily based on such pregenerated list).


-Lukasz
--
You received this message because you are subscribed to the Google Groups "platform-architecture-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to platform-architecture-dev+unsub...@chromium.org.
To post to this group, send email to platform-architecture-dev@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/platform-architecture-dev/b842421e-789d-41a6-9de5-1031f0c04c27%40chromium.org.

Daniel Cheng

unread,
Jan 10, 2017, 1:10:45 AM1/10/17
to Yuki Shiino, Łukasz Anforowicz, blink-reviews-bindings, platform-architecture-dev, Kenichi Ishibashi (via Google Docs)
We ended up choosing this approach for several reasons:

- It makes it clear which C++ methods are exposed via WebIDL. There's been a longstanding desire to write more unit tests rather than layout tests, and making it obvious what is and what isn't exposed is helpful for this goal.
- It reduces the decision tree for naming, since we will have a consistent naming scheme. With lowerCamelCase, we don't need to worry about potential naming collisions, and so the mapping of IDL names to C++ implementation names for developers reading the code is more straightforward.
- It may also allow us to get rid of conventions like suffixing functions with "ForBinding", e.g. currentScriptForBinding() could be renamed to currentScript().

Daniel

On Mon, Jan 9, 2017 at 10:03 PM Yuki Shiino <yukis...@chromium.org> wrote:
Ignorant question: Why do we want to make difference between C++ methods for IDL definitions and C++ methods for other uses?  Why don't we simply change all C++ functions into FooBarBaz style, including IDL defined functions?  To me, it looks easier to change the IDL compiler to generate FooBarBaz style than generating the method list.

Cheers,
Yuki Shiino

2017-01-10 14:59 GMT+09:00 <luk...@chromium.org>:
On Monday, January 9, 2017 at 9:17:46 PM UTC-8, Kentaro Hara wrote:
> If we go with the approach, does it mean that the clang plugin needs to invoke the IDL compiler and process all IDL files at each compilation? If the answer is yes, it would be too heavy.

I think the answer is "no".  We would invoke the IDL parsing tool (*) only once, saving the generated list of C++ methods into a file.  This file would then be repeatedly fed as an extra argument/input into the rewrite_to_chrome_style clang tool (as it processes all the C++ compilation units in Chromium).

(*) whatever "IDL parsing tool" is :-) - I am assuming this will be a new tool based on the code used by the IDL compiler, but I wanted to discuss if this is a good idea and/or if we should consider other approaches (for generating the list of C++ methods that should be blacklisted from the Great Blink Rename;  or maybe even other blacklisting approaches that aren't necessarily based on such pregenerated list).


-Lukasz

--
You received this message because you are subscribed to the Google Groups "platform-architecture-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to platform-architect...@chromium.org.
To post to this group, send email to platform-arc...@chromium.org.

--
You received this message because you are subscribed to the Google Groups "platform-architecture-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to platform-architect...@chromium.org.
To post to this group, send email to platform-arc...@chromium.org.
To view this discussion on the web visit https://groups.google.com/a/chromium.org/d/msgid/platform-architecture-dev/CAN0uC_TYL1ptfEenuSvKgsqbNp31_Y1n3ma%2B-LME9SfByfwc0w%40mail.gmail.com.

Kenichi Ishibashi

unread,
Jan 10, 2017, 1:28:58 AM1/10/17
to Kentaro Hara, Łukasz Anforowicz, platform-architecture-dev, Yuki Shiino, blink-reviews-bindings
Unfortunately the code generator isn't flexible enough for your use case :( I can't offer approaches other than you've already proposed and each approach has different cons.

- A standalone tool which uses IdlReader and Visitor: Difficult to make it 100% accuracy as we have to consider ImplementedAs, PutForwards, and partial interfaces.
- Emitting the list using Jinja templates: This is similar to webmodules. We have to implement Jinja templates and scripts that creates the context for Jinja templates.
- Scraping generated code: Maybe it's difficult to identify where we call impl->foo().

If 95% accuracy is acceptable, a standalone tool would be a reasonable plan. I have a script which parses IDL files and generates JSON. It might be useful to write a tool to generate the list.

Kentaro Hara

unread,
Jan 10, 2017, 1:36:14 AM1/10/17
to Kenichi Ishibashi, Łukasz Anforowicz, platform-architecture-dev, Yuki Shiino, blink-reviews-bindings
Sorry if I'm misunderstanding:

Is the 95% accuracy really acceptable? As far as I understand, the clang plugin will be used not only when you do the Blink Big Rename but also every time people compile Blink in the future.

For example, what happens if firstChild is not supported in the clang plugin and someone use firstChild in her CL? Won't it be detected as a style error?

Daniel Cheng

unread,
Jan 10, 2017, 1:40:30 AM1/10/17
to Kentaro Hara, Kenichi Ishibashi, Łukasz Anforowicz, platform-architecture-dev, Yuki Shiino, blink-reviews-bindings
On Mon, Jan 9, 2017 at 10:36 PM Kentaro Hara <har...@chromium.org> wrote:
Sorry if I'm misunderstanding:

Is the 95% accuracy really acceptable? As far as I understand, the clang plugin will be used not only when you do the Blink Big Rename but also every time people compile Blink in the future.

For example, what happens if firstChild is not supported in the clang plugin and someone use firstChild in her CL? Won't it be detected as a style error?

The clang plugin does not do naming checks today, so we don't need to worry about this aspect. It would be nice to help make sure names are correct (since we'll have a mix of name types on some classes now), but I think that's a bit out of scope of this immediate work. Certainly, it'd be tricky to implement.

That being said, I think we want as close to 100% as possible, as the rebasing helpers will depend on this list as well.

Daniel
 
--
You received this message because you are subscribed to the Google Groups "platform-architecture-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to platform-architect...@chromium.org.
To post to this group, send email to platform-arc...@chromium.org.

Kentaro Hara

unread,
Jan 10, 2017, 1:54:03 AM1/10/17
to Daniel Cheng, Kenichi Ishibashi, Łukasz Anforowicz, platform-architecture-dev, Yuki Shiino, blink-reviews-bindings
On Tue, Jan 10, 2017 at 3:40 PM, Daniel Cheng <dch...@chromium.org> wrote:
On Mon, Jan 9, 2017 at 10:36 PM Kentaro Hara <har...@chromium.org> wrote:
Sorry if I'm misunderstanding:

Is the 95% accuracy really acceptable? As far as I understand, the clang plugin will be used not only when you do the Blink Big Rename but also every time people compile Blink in the future.

For example, what happens if firstChild is not supported in the clang plugin and someone use firstChild in her CL? Won't it be detected as a style error?

The clang plugin does not do naming checks today, so we don't need to worry about this aspect. It would be nice to help make sure names are correct (since we'll have a mix of name types on some classes now), but I think that's a bit out of scope of this immediate work. Certainly, it'd be tricky to implement.

Oh, then after the Blink Big Rename, there's nothing that prevents people from using a camelCase method for non-IDL-exposed things, right? What is Chromium doing about it?

If you need to create a list of C++ methods only for the Blink Big Rename, it sounds like a very temporary thing -- then I'm okay with any hacky solution. For example, as bashi-san mentioned, we can hack the code generator to print all C++ methods (which will give you higher accuracy than hooking IDLReader and Visitor).

 
To unsubscribe from this group and stop receiving emails from it, send an email to platform-architecture-dev+unsubsc...@chromium.org.
To post to this group, send email to platform-architecture-dev@chromium.org.

Daniel Cheng

unread,
Jan 10, 2017, 1:58:06 AM1/10/17
to Kentaro Hara, Kenichi Ishibashi, Łukasz Anforowicz, platform-architecture-dev, Yuki Shiino, blink-reviews-bindings
On Mon, Jan 9, 2017 at 10:54 PM Kentaro Hara <har...@chromium.org> wrote:
On Tue, Jan 10, 2017 at 3:40 PM, Daniel Cheng <dch...@chromium.org> wrote:
On Mon, Jan 9, 2017 at 10:36 PM Kentaro Hara <har...@chromium.org> wrote:
Sorry if I'm misunderstanding:

Is the 95% accuracy really acceptable? As far as I understand, the clang plugin will be used not only when you do the Blink Big Rename but also every time people compile Blink in the future.

For example, what happens if firstChild is not supported in the clang plugin and someone use firstChild in her CL? Won't it be detected as a style error?

The clang plugin does not do naming checks today, so we don't need to worry about this aspect. It would be nice to help make sure names are correct (since we'll have a mix of name types on some classes now), but I think that's a bit out of scope of this immediate work. Certainly, it'd be tricky to implement.

Oh, then after the Blink Big Rename, there's nothing that prevents people from using a camelCase method for non-IDL-exposed things, right? What is Chromium doing about it?

Today, nothing keeps methods (or any other identifier) from being incorrectly named other than code review. Looking carefully, there are definitely instances of naming violations throughout the codebase. However, it's easier to enforce the rule in Chromium itself because the naming is always consistent.

However, that's harder when the rule results in visually inconsistent names... we'll consider how to add a presubmit check for this and revisit this topic.

Daniel
 
To unsubscribe from this group and stop receiving emails from it, send an email to platform-architect...@chromium.org.
To post to this group, send email to platform-arc...@chromium.org.

Kentaro Hara

unread,
Jan 10, 2017, 3:24:27 AM1/10/17
to Daniel Cheng, Kenichi Ishibashi, Łukasz Anforowicz, platform-architecture-dev, Yuki Shiino, blink-reviews-bindings
On Tue, Jan 10, 2017 at 3:57 PM, Daniel Cheng <dch...@chromium.org> wrote:
On Mon, Jan 9, 2017 at 10:54 PM Kentaro Hara <har...@chromium.org> wrote:
On Tue, Jan 10, 2017 at 3:40 PM, Daniel Cheng <dch...@chromium.org> wrote:
On Mon, Jan 9, 2017 at 10:36 PM Kentaro Hara <har...@chromium.org> wrote:
Sorry if I'm misunderstanding:

Is the 95% accuracy really acceptable? As far as I understand, the clang plugin will be used not only when you do the Blink Big Rename but also every time people compile Blink in the future.

For example, what happens if firstChild is not supported in the clang plugin and someone use firstChild in her CL? Won't it be detected as a style error?

The clang plugin does not do naming checks today, so we don't need to worry about this aspect. It would be nice to help make sure names are correct (since we'll have a mix of name types on some classes now), but I think that's a bit out of scope of this immediate work. Certainly, it'd be tricky to implement.

Oh, then after the Blink Big Rename, there's nothing that prevents people from using a camelCase method for non-IDL-exposed things, right? What is Chromium doing about it?

Today, nothing keeps methods (or any other identifier) from being incorrectly named other than code review. Looking carefully, there are definitely instances of naming violations throughout the codebase. However, it's easier to enforce the rule in Chromium itself because the naming is always consistent.

However, that's harder when the rule results in visually inconsistent names... we'll consider how to add a presubmit check for this and revisit this topic.

Yeah, I think this is a key question.

If you only need a one-time solution, I'm okay with any hack. However, if you plan to introduce a presubmit check, we should carefully think about what a right solution is, even if it needs some engineering efforts. (e.g., creating a dedicated Jinja template sounds nice to me.)


 
Daniel
 
To unsubscribe from this group and stop receiving emails from it, send an email to platform-architecture-dev+unsub...@chromium.org.
To post to this group, send email to platform-architecture-dev@chromium.org.



--
Kentaro Hara, Tokyo, Japan

luk...@chromium.org

unread,
Jan 13, 2017, 1:21:42 PM1/13/17
to platform-architecture-dev, dch...@chromium.org, ba...@chromium.org, luk...@chromium.org, yukis...@chromium.org, blink-revie...@chromium.org
Status update - I have a WIP CL at https://crrev.com/2624413002 for extracting C++ methods from IDL using IDLReader.  There are still a few remaining issues that I need to flesh out before sending this CL out for review, but this approach seems promising so far.

On Tuesday, January 10, 2017 at 12:24:27 AM UTC-8, Kentaro wrote:
On Tue, Jan 10, 2017 at 3:57 PM, Daniel Cheng <dch...@chromium.org> wrote:
On Mon, Jan 9, 2017 at 10:54 PM Kentaro Hara <har...@chromium.org> wrote:
On Tue, Jan 10, 2017 at 3:40 PM, Daniel Cheng <dch...@chromium.org> wrote:
On Mon, Jan 9, 2017 at 10:36 PM Kentaro Hara <har...@chromium.org> wrote:
Sorry if I'm misunderstanding:

Is the 95% accuracy really acceptable? As far as I understand, the clang plugin will be used not only when you do the Blink Big Rename but also every time people compile Blink in the future.

For example, what happens if firstChild is not supported in the clang plugin and someone use firstChild in her CL? Won't it be detected as a style error?

The clang plugin does not do naming checks today, so we don't need to worry about this aspect. It would be nice to help make sure names are correct (since we'll have a mix of name types on some classes now), but I think that's a bit out of scope of this immediate work. Certainly, it'd be tricky to implement.

Oh, then after the Blink Big Rename, there's nothing that prevents people from using a camelCase method for non-IDL-exposed things, right? What is Chromium doing about it?

Today, nothing keeps methods (or any other identifier) from being incorrectly named other than code review. Looking carefully, there are definitely instances of naming violations throughout the codebase. However, it's easier to enforce the rule in Chromium itself because the naming is always consistent.

However, that's harder when the rule results in visually inconsistent names... we'll consider how to add a presubmit check for this and revisit this topic.

Yeah, I think this is a key question.

If you only need a one-time solution, I'm okay with any hack. However, if you plan to introduce a presubmit check, we should carefully think about what a right solution is, even if it needs some engineering efforts. (e.g., creating a dedicated Jinja template sounds nice to me.)
 
I think a robust presubmit is difficult or infeasible.  The main problem is with handling a situation where an operation or attribute accessor is implemented in a base class.  For example: Document.idl declares getElementsByTagName operation, but from IDL alone it is not possible to see that this operation is not implemented by blink::Document::getElementsByTagName, but by a method in one of blink::Document's base classes - blink::ContainerNode::getElementsByTagName.  AFAICT naive IDL mapping results in ~ 703 out of ~ 4673 IDL implementation methods not found in expected header files (on one hand this is a very ad-hoc, grep-based and possibly inaccurate estimation, but OTOH so far 95% of cases are caused by the inheritance problem described here).

Please note that if one incorrectly names IDL implementation method using PascalCase, instead of camelCase, then things won't compile (because DOM bindings generator will keep emitting references to methods using camelCased names).  Therefore no presubmit might be needed for this case.  OTOH, it is possible that we will slowly accumulate methods that unnecessarily use camelCase, despite not being needed by IDL - this is undesirable, but (as described in the previous paragraph) I don't currently see how to create a presubmit preventing this.


-Lukasz

(*) naive = expecting operations and accessors for Document IDL interface to be implemented in blink::Document [well except if ImplementedAs is present, but this doesn't change the inheritance problem on the C++ side of things].

Kentaro Hara

unread,
Jan 15, 2017, 7:52:07 PM1/15/17
to Łukasz Anforowicz, platform-architecture-dev, Daniel Cheng, Kenichi Ishibashi (via Google Docs), Yuki Shiino, blink-reviews-bindings
Sounds reasonable.

 

-Lukasz

(*) naive = expecting operations and accessors for Document IDL interface to be implemented in blink::Document [well except if ImplementedAs is present, but this doesn't change the inheritance problem on the C++ side of things].

--
You received this message because you are subscribed to the Google Groups "platform-architecture-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to platform-architecture-dev+unsub...@chromium.org.
To post to this group, send email to platform-architecture-dev@chromium.org.
Reply all
Reply to author
Forward
0 new messages