I think our code review process is partly to blame.
I can't fault myself for writing code that I think would be useful.
I can't fault others for being skeptical until they see it working.
I can't fault others for not wanting to spend time reviewing things
that don't add core functionality.
I can't fault others for being conservative about changes to core functionality.
So as I can't fault myself or others, I think the process must be to blame.
The process requires code review for all source code outside docs and
demos. It requires consensus on including, but not line-by-line
review for third-party code.
I think our process is preventing us from recognizing ways to improve
our process. To produce a quality product, we need to have useful
tools, and to have confidence in those tools. But it is hard to tell
whether tools are useful without trying them out on a trial basis.
And it is hard to convince others to spend the time reviewing source
for tools that they have no reason to believe are useful.
What I would like is to be able to deploy tools that use our core
code, but that are not used by it. Ideally, I would like to be able
to commit tools, add them to the build system in a way that does not
affect files built, and let people try them out. Then when people
realize it's useful, and want to start relying on them, they could
review them. Review might not need to be line-by-line, but could be
that there is sufficient test coverage for them to have confidence.
I think that third party is the wrong place to put these tools, since
they're not third party, and because the dependency direction is wrong
; our core code depends on third party, not the other way around.
I think that experimental is the wrong place, because it is not
maintained, and tests there are not run on presubmit so any code put
there is likely to rot. And there are no existing build system
dependencies on experimental.
I propose adding a directory that is a sibling of demos called
ancillary. Ancillary would be for tools that can be depended on by
the build system, but not by third party code. The team can use them,
but makes no guarantees that they're useful to others. Candidates
from ancillary that are generally useful, and that people rely on
heavily can be moved into com.google.caja.tools.
On Fri, Sep 11, 2009 at 12:52 PM, Mike Samuel <mikes...@gmail.com> wrote:
> I propose adding a directory that is a sibling of demos called
> ancillary. Ancillary would be for tools that can be depended on by
> the build system, but not by third party code. The team can use them,
> but makes no guarantees that they're useful to others. Candidates
> from ancillary that are generally useful, and that people rely on
> heavily can be moved into com.google.caja.tools.
If someone refactors the core classes for some reason, could this
break the ancillary classes? If so, whose responsibility is it to fix
the breakage? Given that the ancillary stuff will have been likely
only seen by one person, will they have to seek out the original
author to unbreak ancillary?
Ihab
--
Ihab A.B. Awad, Palo Alto, CA
First of all: I honestly don't have a clearly defined conclusion yet
since (fwiw...) I think the overall documentation scheme espoused by
this CL is actually pretty nifty.
One other way to look at it might be to think about what is special in
the Caja project that we could provide as unique value (beyond other
joe generic doc tools that might exist). Does our work encourage (or
require) a style of programming (with closures rather than prototypal
inheritance and Java-like "classes") that in turn requires this tool?
If so, is this tool generally useful for the world, or is it only
useful for Caja? If for Caja, should we optimize it as a doc tool *for
cajoleable code*? What, if any, simplifications do we gain by
narrowing the focus as such? If not just for cajoleable code, do we
best serve its purpose by making it a completely separate project?
One thing to think about is that, hopefully, our TCB of un-cajoled
code should remain small and get smaller. Hence maybe we get a bigger
win for something that cajoles Caja code (and, say, appends docs to
the module format in much the same way that the debug information is
currently added).
In any case, here's some data. After patching in this CL:
.../google-caja ihab$ svn diff | wc -l
9989
That's 10kloc for a documentation tool used as part of our build
process. In addition to the benefits of such a tool, we should be
worrying about cost.
Sure. If you include tests, documentation, unchanged context lines,
and whitespace lines; and double-count changed lines you get a big
number.
If you count only operational lines of code, it's 4012 LOC.
I was with you all the way up to this point. Why single out the build
system, particularly? What you're saying seems to apply equally to any
optional tool.
Also, if they're going to be unreviewed then it seems kinda mean to
put the burden on all committers to maintain them - seems to me we'd
need presubmit tests that were informational only - i.e. they can fail
and that's OK - ideally a submit that caused one of these tests to
fail would mail the list so whoever was interested in maintaining it
would know to take a look.
This also implies that they can't cause the build to fail if they
fail. Do you think this can be achieved?
Anything that affects files being output by the build system is a
possible security threat, and so people are right to be suspicious of
things that complicate the build.
But now that I think of it, anything that affects source editing or
review is in the same boat.
> Also, if they're going to be unreviewed then it seems kinda mean to
> put the burden on all committers to maintain them - seems to me we'd
> need presubmit tests that were informational only - i.e. they can fail
> and that's OK - ideally a submit that caused one of these tests to
> fail would mail the list so whoever was interested in maintaining it
> would know to take a look.
Yes, Ihab raised the same point and I think it's a good one.
I have been maintaining this outside the build system, and I am happy
to keep maintaining it.
I think the best approach is to have test failures for that branch
informational, but if you change something that breaks the build for
ancillary tools and it's not obvious how to fix it, just drop the
maintainer a line.
> This also implies that they can't cause the build to fail if they
> fail. Do you think this can be achieved?
Yes. I think it might require sending an email or two, and maybe
delaying a submit for a bit of coordination. But API changes to core
components tend to require that anyway -- we need to broadcast those
widely to make sure knowledge about core component APIs is widely
shared.
Perhaps, but then we would have a cyclic dependency.
So to build Caja, we would have two versions of caja on the classpath:
the one that includes our custom ant rules and the source to build,
and an older one included from the external project. That sounds like
a headache.
> --
> Cheers,
> --MarkM
>
I know this is rather tangential to your point, but we really hope to
cajole jquery, not tame it; then we don't have to maintain versions.
On Mon, Sep 21, 2009 at 7:42 PM, Mike Samuel <mikes...@gmail.com> wrote:
>
> What are the outstanding points of contention?
>
> Do people agree or disagree with the following (not all pair-wise
> consistent) statements:
> (1) There should be a way to integrate candidate tools that depend on
> the Caja core into the build process that does not require line-by-
> line review before the tool has proven useful.
Agree.
> (2a) The candidate tool developer should bear the burden of
> maintaining tools, at least until the team comes to rely on them -- at
> which time it is no longer a candidate.
Agree.
> (2b) A candidate tool developer should be able to add tests to
> AllTests and have them run on precommit.
So long as precommit allows a commit even if they fail.
> (2c) A candidate tool developer should be able to expect devs who
> change core APIs to get the tool compiling
Disagree.
> (2d) A candidate tool developer should not expect devs who change core
> Caja code to make sure that their tests run green.
Agree.
> (3a) Candidate tools can stop a build from completing.
Unclear what this means - ideally, they should not stop a build from
completing stuff that is independent of said tool.
> (3b) Candidate tools must not affect the output of a build target that
> succeeds.
Should be able to disable the tool and still build, would be a weaker
requirement that I think I'd be OK with - so they could change the
output, but you should be able to easily revert...
E.g., could a lint tool cause a particular build target to fail if it
notices a serious problem?
>> (3b) Candidate tools must not affect the output of a build target that
>> succeeds.
>
> Should be able to disable the tool and still build, would be a weaker
> requirement that I think I'd be OK with - so they could change the
> output, but you should be able to easily revert...
One of the reasons we require code review is to guard against
malicious code from making it into the codebase. Unreviewed tools
could, theoretically, modify built files and so violate integrity.
This is also true of third party code and tools like javac. I don't
think we can really rely on code review to guard the tools in our TCB
but I listed it as an issue because others might have different
opinions.
So long as I can turn it off easily, yes.
>>> (3b) Candidate tools must not affect the output of a build target that
>>> succeeds.
>>
>> Should be able to disable the tool and still build, would be a weaker
>> requirement that I think I'd be OK with - so they could change the
>> output, but you should be able to easily revert...
>
> One of the reasons we require code review is to guard against
> malicious code from making it into the codebase. Unreviewed tools
> could, theoretically, modify built files and so violate integrity.
> This is also true of third party code and tools like javac. I don't
> think we can really rely on code review to guard the tools in our TCB
> but I listed it as an issue because others might have different
> opinions.
That's a good point. Presumably, though, you'd want the option for
them to do exactly that - suppose they're an optimiser, for example.
But you'd want it off for official bullds, and both off and on for
tests.
>
It will be built to a separate class directory from ant-lib, so that
our regular code cannot depend on it -- that ant-ancillary-lib will
not be on the default source path or class path. But it will be on the
path for our ant extensions.
We can then define build rules that use it, and those are fair game in
the build system.
I will have the ancillary classes on the test source path/class path
so that the ancillary tests are run as normal, but will make sure that
their failures do not stop submit.
I will add a build target no-ancillary that disables any build targets
that use them, so that they can be turned off for production releases,
and in case they go haywire.
Initially, svn commit checks will require that they vuild, but if they
break frequently, we can revisit that decision.
Does that work for everyone?
2009/9/23 Ben Laurie <be...@google.com>:
Does silence imply assent here?
--
2009/9/24 Mike Samuel <mikes...@gmail.com>:
> How about this:Does silence imply assent here?
> com.google.caja.ancillary becomes a place for tools that can be submitted
> to with cursory review
>
> It will be built to a separate class directory from ant-lib, so that
> our regular code cannot depend on it -- that ant-ancillary-lib will
> not be on the default source path or class path. But it will be on the
> path for our ant extensions.
>
> We can then define build rules that use it, and those are fair game in
> the build system.
>
> I will have the ancillary classes on the test source path/class path
> so that the ancillary tests are run as normal, but will make sure that
> their failures do not stop submit.
>
> I will add a build target no-ancillary that disables any build targets
> that use them, so that they can be turned off for production releases,
> and in case they go haywire.
>
> Initially, svn commit checks will require that they vuild, but if they
> break frequently, we can revisit that decision.
>
> Does that work for everyone?
In the interests of moving forward, my vote is +1.
Longer term, I would urge us to think about the following two
directions I alluded to earlier:
1. If the Caja parse tree representation and what not are building
blocks for general tools, we should refactor Caja to make it easier to
import *just* these pieces. Then perhaps projects like the linter and
jsdocer can be first-class projects in their own right. The advantage
is that people who *just* want that stuff can get it without having to
schlep down all of Caja.
2. On the other hand, if there are opportunities in these tools to
build cool Caja-specific features, we should concentrate on that
addition of unique value to the Caja project.
Do you disagree about my conclusions re cyclic dependencies in this scenario?
I don't disagree with your reasoning. I do think that circular deps
are part of the universe and there are ways to finesse the situation
(which, I claim, is the best we can do).
Assume Caja exports "caja-parse.jar" which is some minimal parser
schmivit. Project "jsdoc" can include this, and package it in its own
distro. Project Caja can then re-import the whole thing and run it
from Ant with "fork=true" so that the CLASSPATH is not polluted. That
way, also project "jsdoc" and Caja can evolve independently.
Alternatively, Caja can just pull in *only* the jsdoc stuff from
project "jsdoc" and snapshot a tested version of "jsdoc" that works
with the current Caja parse tree. If the Caja parse tree changes in a
way that breaks "jsdoc", then that's something to negotiate --
essentially it means that Caja broke "jsdoc" and the resulting woes
are visited upon all users of "jsdoc" including Caja.
But again -- I really don't want to hold this up looking for a more
perfect solution (whether or not there is agreement that "my"
solution(s) are more perfect). I believe we should just proceed and
see how things look.
Does fork=true work on ant verbs other than junit?
It works with the <java> task, which is presumably all we need. :)
Ihab
ps Yes, Java got CLASSPATH namespace hygiene hugely wrong. (Yes you
can build on that using classloader magic, but that's no longer
idiomatic Java programming.) Don't blame me; I didn't mess that one
up. Where I *can* lend my voice, I'm doing my best.
http://wiki.commonjs.org/wiki/Packages. ;)
I think we have a wiki page on code reviews so I'll update that first.
I'll send out CLs to patch build.xml and such to make tools test
breakage non-blocking.
Once those are in I'll check in the linter, and once I've sandboxed
Rhino to address Jas's comments on jsdoc, I'll check that in too.
If there are problems with maintenance, we can cut that stuff out of
the loop, and look into splitting up the codebase. We would need to
retain checkin control over the core code, since things like the
renderer are in our TCB.
2009/9/28 <ihab...@gmail.com>:
It would be nice to have an automated way to build both with and without
ancillaries and sanity-check that the outputs are the same. (I'm more
concerned here with accidental dependencies; if the ancillaries were
malicious then this wouldn't be sufficient.) Don't let this suggestion
hold you up, though.
--
David-Sarah Hopwood ⚥ http://davidsarah.livejournal.com
Silence implies travel in my case, but sounds good.
I think Ihab was right that separating out non-core components would
be really neat, but I argued that it would introduce classpath
problems. I didn't know how svn:externals worked at the time, but now
I think it might give us a way to have the advantage of separating
ancillary out into a separate project without having the multiple jar
problem.
Each project would use svn:externals to import a specific version of
source code from the other. So each could build and evolve
separately.
So com.google.caja.ancillary could move into a separate project, for
sake of argument called google-caja-tools.
In the below, -> means a source tree included via svn:externals.
google-caja's code tree might look like
src/
com/
google/
caja/
ancillary/ -> google-caja-tools/src/com/google/caja/ancillary @ r1234
...
and google-caja-tools code tree might look like
src/...
core-src/ -> google-caja/src/ @ r2345
third-party -> google-caja/third-party/ @ r2345
so each could import the other without pre-compiling jars, and eclipse
projects for each could allow easy browsing and editing of the other's
source and checking out one would still be a single step process.