The "Sandbox Model" addons.mozilla.org uses to organize and review add-ons was first announced almost 3 years ago (http://blog.fligtar.com/2006/11/21/reviewing-the-review-process/). Since then, we've made a number of changes based on user feedback that, in my opinion, have greatly improved the experience of finding and installing add-ons that haven't been officially reviewed yet.

Today, the main feedback concerning the review and distribution process of add-ons is:
* developers feel it takes too long for add-ons to be reviewed, and
* users and developers want to receive updates to add-ons that they have installed that haven't been reviewed yet
It's important for us to balance our desire for all add-ons to be discoverable and easy to install with the need for security measures for add-ons that haven't been reviewed yet.
After taking many of these issues into account, I've come up with a proposal for removing the public and sandbox classifications on the site and moving to a more flexible, comprehensive trust system based on everything we know about an add-on. If you're interested in the review process and distribution of add-ons, please read the proposal below and give us your feedback.
Another trust modifier could be how active the add-on is. An add-on that is updated often would benefit more from the trusted status. Though I don't see a difference between trusted status and public status with the removal of the sandbox (maybe it is mentioned and I just missed it).
The new trust system seems like a positive change to AMO. I look forward to the day when I can have updates distributed without having to go through editors.
Having just read the proposal, I like it a lot. Not only will it get add-ons to the people faster, it should also provide a more accurate and knowledgeable "shopping" experience. I noticed the Trust Indicators, but I didn't see anything about user reviews and press. While these might be a bit more edge case in their occurrence, I think they are quite valid in determining the trust of an add-on.
All in all, great proposal. Keep up the great work!
A mandatory code review is the only way to keep malware at bay in my opinion. Everything else can (and will) be cheated.
So I'd keep the current system, but would add the ability for users to find new or experimental addons more easily, for example by introducing an "Experimental" category, or by providing new search or view modes like "only recommended addons", "new addons", "all public addons", "only experimental addons", "all addons" and to sort by rating, popularity, age, application version, number of reviews etc.
That way it's not as bad being experimental since there might be people who are actively searching for new stuff to try out. Currently it is pretty difficult finding experimental addons without the editor review queue.
Cesar's suggestion of author activity is a good trust modifier. It might prevent addons from being abandoned.
Anthony's suggestion of also including the user reviews and press would also be nice but how would we keep track of that information?
One possible issue with the Trust Score is that, as we tweak it to prevent gaming, these adjustments might take an addon from public to sandbox or vice versa.
Also we might get into addon wars, where one addon's user base will purposefully try to lower the Trust Score of another.
Trying to figure out how to curb the gaming is going to be an interesting, exciting process... especially if the Trust Score system is going to be transparent.
While it would be nice to be able to speed up the review process (one add-on of mine took about 8 months for the initial review), I do think this could be misused if not implemented properly. There have been examples of "trusted" add-ons doing untrusted things.
Regarding code reviews, except perhaps on the initial update, are they even done currently? I know a number of add-ons (including mine) which appear to be reviewed simply by seeing if they install. No actual testing is done, let alone a code review. I've actually had releases with fairly major bugs in them which I missed before the release went public. Unfortunately, while the buggy version got pushed out within a day, the fixed version took over a week to get reviewed.
I think a good potential trust modifier could be a percent change from the previous version. I'm not even sure if this is something that could be measured automatically or not, but I don't see why a minor release for an add-on should take as long (or longer) to review as a major one.
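On whether a percent change could be measured automatically: a line-based diff gets you a rough number. This is only a sketch using Python's standard `difflib`; the function name `percent_changed` and the idea of comparing raw source text are my assumptions, not anything AMO does.

```python
import difflib


def percent_changed(old_source: str, new_source: str) -> float:
    """Rough share (0-100) of lines that differ between two versions of a file."""
    matcher = difflib.SequenceMatcher(
        None, old_source.splitlines(), new_source.splitlines()
    )
    # ratio() is 1.0 for identical inputs and 0.0 when nothing matches.
    return round((1.0 - matcher.ratio()) * 100, 1)


old = "function greet() {\n  alert('hi');\n}\n"
new = "function greet() {\n  alert('hello');\n}\n"
print(percent_changed(old, new))
```

A number like this only says how much text moved, not how risky the change is, so it could at best help order the review queue.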
I have to say this would be great, and can smooth the process a lot. The implementation will make a big difference, however.
* The 1 - 4,999 bucket (with no installation) should really be a transitional one, because not having updates is problematic for users. In other words, if passing 5,000 is as hard as getting out of the sandbox is today, the problem will stay.
* I understand that the 0 or less is really for extensions that behaved badly and get punished. But we have to make sure that once an extension gets there, it doesn't get stuck forever. There is a risk that this happens because if it can't be installed it won't have users (no points), so it won't get rated (no points), so it has little chance to rise even if the author fixed the problem that caused the punishment. One option would be to have it blocked for one month, then back to 1 point. Unless of course getting under 0 is only for incomplete add-ons.
Also, that may sound obvious but why not add some points for extensions that are or have been recommended by AMO? That's definitely a proof of trust.
It's really cool to hear about release channels also, it's something developers have been waiting for a long time.
This looks really good. Is there a mapping that shows which of the trust states would show up in the search results in the Firefox add-ons manager? I guess I'm assuming only the 10,000+ ranges since we have no checkbox there as such.
Is the trust score for the add-on as a whole or just a single version of the add-on? If the latter then have you considered a kind of trust cascade system where a newly posted version of an add-on inherits a percentage of the trust of the old version. This could effectively mean that an update for an add-on that already has a very high trust score could become public and arrive in users' applications very quickly.
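To make the cascade idea concrete, here is a minimal sketch. The 75% inheritance fraction, the 10,000 threshold, and the function names are all assumptions for illustration; the proposal does not specify any of them.

```python
PUBLIC_THRESHOLD = 10_000  # assumed cut-off for illustration only


def inherited_trust(previous_score: float, inherit_fraction: float = 0.75) -> float:
    """Starting score of a new version, as a fraction of the old version's score."""
    if not 0.0 <= inherit_fraction <= 1.0:
        raise ValueError("inherit_fraction must be between 0 and 1")
    # A punished (negative) previous version passes on no trust at all.
    return max(previous_score, 0.0) * inherit_fraction


def starts_public(previous_score: float, inherit_fraction: float = 0.75) -> bool:
    """True if the update would be public straight away under this sketch."""
    return inherited_trust(previous_score, inherit_fraction) >= PUBLIC_THRESHOLD


print(inherited_trust(20_000))  # a highly trusted add-on posts an update
print(starts_public(20_000))
print(starts_public(6_000))
```

With these made-up numbers, an add-on sitting at 20,000 would ship its update immediately, while one at 6,000 would still wait for more signals.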
I think that designing trust systems that are hard to game is a really, really difficult problem, and you need to talk to people who've done it before you try :-) Advogato (http://www.advogato.org/) was an early platform for this sort of research. http://www.advogato.org/trust-metric.html
Let's adopt the perspective of a bad actor who wants to get a malicious extension onto AMO so people will install it and he can steal their data or control their machine, and go through the proposed inputs and have a think about which can be gamed, and how. Of course, the more metrics there are, you have questions about how the scores combine. If scores are, say, added, a bad actor may only need to game a few metrics to get their scores above whatever the magic figure is.
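A toy illustration of the additive problem, with entirely made-up metric names, weights, and a made-up threshold of 100 (none of these come from the proposal):

```python
THRESHOLD = 100  # hypothetical "magic figure"


def additive_score(metrics: dict[str, int]) -> int:
    """Combine per-metric points by simple addition."""
    return sum(metrics.values())


# A brand-new add-on with no editor review and modest real usage.
honest = {"editor_review": 0, "active_users": 10, "ratings": 15, "support_info": 5}

# The same add-on after faking update pings and robot reviews only.
gamed = dict(honest, active_users=60, ratings=40)

print(additive_score(honest), additive_score(honest) >= THRESHOLD)
print(additive_score(gamed), additive_score(gamed) >= THRESHOLD)
```

The gamed add-on clears the bar without ever being looked at by an editor, because the two inflated metrics make up for the missing one.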
# Editor Review - an editor's assessment of the add-on
  - No more gameable than it is now, although editors may do less detailed work if they are relying on the trust system, and if the aim of the exercise is to reduce the amount of editorial control needed

# Active Users - the number of users who have the extension installed
  - Presumably measured by update pings? Very easily gameable.

# Ratings - the Bayesian rating of an add-on based on all user reviews
  - Given that we don't control accounts very well, this would be fairly easily gameable too - just robot in good reviews.

# Flags - the number of times a user has flagged the add-on as a violation (to be implemented)
  - Not gameable, as the bad actor cannot reduce the number. But of course you need to implement it. And if people's data is being stolen or their privacy is being violated, they may well not notice so they won't flag it. This is also an "enumerate badness" model.

# Add-on Verification Tool - automated check of add-on packaging, adherence to policies, and common security problems (to be implemented; see spec)
  - Given that the tool will be free software, malware can be written to pass the checks. Given JavaScript's ability to create code from strings, I suspect it's very hard to write a full fidelity code checker. This is the halting problem.

# Support Information - does the author provide a support URL or e-mail address?
  - Trivially gameable.

# Other Add-ons by the Developer - how much do we trust the other add-ons this developer has made?
  - Gameable only in that we can apply the same gaming tactics to the other add-ons.
In the thread, other people have suggested:
# How active the add-on author is (Cesar Oliviera)
  - Easily gameable.

# % change from previous version (Morac)
  - Easily gameable. Make most of the changes you want in a previous version, then do a small update which just enables the nastiness.
All in all, not a great result for non-gameability.
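To put a rough number on the "robot in good reviews" point: the sketch below uses a common Bayesian-average formula, where a prior pulls small samples toward a site-wide mean. I don't know the exact formula AMO uses; `prior_mean` and `prior_weight` are assumed values.

```python
def bayesian_rating(ratings: list[int], prior_mean: float = 3.0, prior_weight: int = 10) -> float:
    """Bayesian average: behaves as if prior_weight reviews at prior_mean were already cast."""
    return (prior_weight * prior_mean + sum(ratings)) / (prior_weight + len(ratings))


genuine = [2, 1, 2, 3, 1]        # five real, unhappy users
botted = genuine + [5] * 200     # plus 200 scripted five-star reviews

print(round(bayesian_rating(genuine), 2))
print(round(bayesian_rating(botted), 2))
```

The prior protects against a handful of fake reviews, but it is swamped once an attacker can create accounts in bulk.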
I would suggest that a better approach would be to trust people, in an Advogato-like model, and have that trust flow through to extensions and other people only so far as the trusted people are willing to endorse them. This is sort of like a modified version of the current system, which is effectively that extensions go from 0% trusted to 100% trusted upon the endorsement of a single reviewer. Instead, we could encourage add-on authors to become part of the trusted, reviewing and endorsing community, and express their preferences for trustworthy addons on the site. The value of their endorsement would depend on their trustworthiness, which would depend on the trustworthiness of those who trust them, and so on. The "trust anchors" would be the existing reviewers.
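As a very rough sketch of trust flowing outward from anchors: the code below is NOT Advogato's actual metric (which is based on network flow). It is a much simpler stand-in where each endorsement passes on a fixed fraction of the endorser's trust. The names, the 0.5 decay, and the "best endorser wins" rule for extensions are all assumptions.

```python
from collections import deque


def propagate_trust(anchors, endorsements, decay=0.5):
    """Walk outward from the anchors; each endorsement hop multiplies trust by decay.

    endorsements maps an endorser to the list of people they endorse.
    """
    trust = {anchor: 1.0 for anchor in anchors}
    queue = deque(anchors)
    while queue:
        person = queue.popleft()
        for endorsed in endorsements.get(person, []):
            # Breadth-first order means the shortest chain to an anchor is found first.
            if endorsed not in trust:
                trust[endorsed] = trust[person] * decay
                queue.append(endorsed)
    return trust


def extension_trust(endorsers, trust):
    """Trust of an extension is that of its most trusted endorser (0.0 if none)."""
    return max((trust.get(person, 0.0) for person in endorsers), default=0.0)


endorsements = {
    "editor_a": ["author_b"],
    "author_b": ["author_c"],
    "sock_1": ["sock_2"],   # fake accounts endorsing each other
    "sock_2": ["sock_1"],
}
trust = propagate_trust(["editor_a"], endorsements)
print(trust)
print(extension_trust(["author_c"], trust))
print(extension_trust(["sock_1", "sock_2"], trust))
```

The point of the example is the last line: accounts that only endorse each other never connect to an anchor, so their endorsements are worth nothing.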
Gervase Markham wrote:
> # Flags - the number of times a user has flagged the add-on as a violation (to be implemented)
>   - Not gameable, as the bad actor cannot reduce the number.
Easily misusable though by people who want to make life harder for their "competitor" add-ons.
I think we need to take care of people wanting to game up their own score as well as those trying to game down the scores of others (either because of competition or because of personal dislike or such).
What I like a lot in there is the concept of update channels coming to AMO with it. I talked to and mailed Rey Bango a while ago about a few add-ons (which probably would be in a trusted state - language packs, lightning, i.e. things coming from our own Mozilla repos) that would even like to provide nightly updates through AMO if possible. Update channels are one thing needed for those as well (would be a "nightly" channel there, but basically the same strategy).
> # How active the add-on author is (Cesar Oliviera)
I don't quite see why activity alone should be a trust modifier. If a project is updated every day, it could also mean that the code isn't properly tested and that the developers have no plan for where they are heading.
Some of the addons I consider essential haven't been updated in years because they simply work as promised by their developers.
While I agree that the current situation is (rightly) perceived as problematic by (some) add-on authors, I am not sure this is the right solution. Gerv has already provided a good summary of how gameable this system is, and I would tend to agree with what he has said. Although the happiness of our add-on authors and editors is important, I think the happiness of our end users is still more important, and a system that is as gameable as this will open the door significantly wider than it is now in terms of allowing malicious add-ons.
I will add that malice and user trust is not everything, so I would be skeptical even of the system Gerv suggests: although I do not have statistics (could we get some?) I believe a fairly significant number of the add-ons/updates we reject are rejected based on security issues, and a literally huge number of rejections are based on not namespacing/wrapping code in the browser window. In the first case, the add-on will work extremely well for all users - until they join an insecure wireless network that exploits the security issues in the add-on (eval-ing source procured over http, or similar issues). In the second, it will work well until they install some other add-on which defines functions with the same name, after which one will stop working -- diagnosing these issues is a massive headache. Like Gerv, I doubt whether any of this could easily be caught by automated tools (although "mistakes" are definitely easier to find than purposeful malice).
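To illustrate why automated tools struggle here, take the eval case. The sketch below is a deliberately naive checker written in Python that scans JavaScript source text for a literal `eval(` call; the function name and the two JavaScript snippets are invented for the example and do not represent the planned verification tool.

```python
import re

# Matches a literal call such as eval(...) in the source text.
FORBIDDEN_CALL = re.compile(r"\beval\s*\(")


def naive_check(js_source: str) -> bool:
    """Return True if the source passes, i.e. no literal eval( call is found."""
    return FORBIDDEN_CALL.search(js_source) is None


direct = "eval(request.responseText);"
# Same effect, but the function name is assembled from strings at runtime.
indirect = 'var f = window["ev" + "al"]; f(request.responseText);'

print(naive_check(direct))
print(naive_check(indirect))
```

The second snippet does the same dangerous thing and sails through. A smarter tool can catch this particular trick, but someone acting on purpose can always find the next one, which is why honest mistakes are so much easier to find than malice.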
In the end, I think code review by editors is one of the best ways we have of protecting our users, and I think that that is more important than the discomfort of add-on authors or editors.
Cheers, Gijs
Co-developer of Venkman and ChatZilla, author of Chrome List, AMO editor.
Gervase did a nice job of summarizing how gameable such a system would be.
Now, if the aim is to speed up the review process and make life easier for editors, while at least not sacrificing the security or privacy of add-on users, I would propose the following:
- Provide a technical feedback channel for addons, just like the user reviews, but for reporting defects, breaches of privacy policy, copyright violations etc. But also for positive feedback like "works well on MacOS X". These feedbacks should be viewable by the author and editors, and maybe (should be open for discussion) by normal users. Technical feedback could be categorized into: Bug reports, feature requests, security bug reports, breaches of privacy policy and reports of successful usage scenarios. This could be used to calculate a functional trust score.
- Technical feedbacks should be moderatable like reviews. Resolved bugs, feature requests or false technical issues should not negatively impact an addon for long. Maybe remove old technical feedbacks a week after an update.
- Make it the primary job of the editors to do code reviews of Addons. Keep that down to checks for security and bad coding practice (loose globals, uses of eval, binary code, obfuscated js etc.).
- Addons that have not been code-reviewed stay in the sandbox
- Rely on the user base for functional testing and bug reports
- If an addon is code reviewed, has been downloaded a while from the sandbox and does not have important technical defects reported, let it go public.
- Every existing public addon should be able to provide a current "beta" alongside the stable version. Users have to check a checkbox in order to download the untrusted beta version. Separate technical reports for the beta should be possible.
Let me also propose a heuristic engine to detect potentially malicious addons. That would be sort of complicated, but certainly not impossible. We could define some actions that are undesirable, like sending bookmarks to the web in clear text or anything that could affect the privacy of the user. Basically, with this engine, we could stop a malicious update to a trusted addon from happening.
Ohh, I know, I have too much imagination.
On Jul 1, 11:37 pm, Justin Scott <flig...@mozilla.com> wrote:
> The "Sandbox Model" addons.mozilla.org uses to organize and review
> add-ons was first announced almost 3 years ago
> (http://blog.fligtar.com/2006/11/21/reviewing-the-review-process/).
> Since then, we've made a number of changes based on user feedback that,
> in my opinion, have greatly improve the experience of finding and
> installing add-ons that haven't been officially reviewed yet.
>
> Today, the main feedback concerning the review and distribution process
> of add-ons is:
> * developers feel it takes too long for add-ons to be reviewed, and
> * users and developers want to receive updates to add-ons that they have
> installed that haven't been reviewed yet
>
> It's important for us to balance our desire for all add-ons to be
> discoverable and easy to install with the need for security measures for
> add-ons that haven't been reviewed yet.
>
> After taking many of these issues into account, I've come up with a
> proposal for removing the public and sandbox classifications on the site
> and moving to a more flexible, comprehensive trust system based on
> everything we know about an add-on. If you're interested in the review
> process and distribution of add-ons, please read the proposal below and
> give us your feedback.
Hi,
I think we should definitely drop the terms public and sandbox from the site as they've acquired negative connotations along the way, furthering the impression that the first thing you have to do is get your add-on public (which often got immediately rejected, discouraging further effort).
I still think code review is a vital part of protecting the less experienced users from poorly written or malicious add-ons though. The trust model, while in an ideal world a good replacement, is inherently flawed as Gerv explained already.
I agree with what Gijs has written summarising the common reasons we reject addon submissions and how a trust model based on popularity will never pick up these reasons reliably.
The Trust Score would be a useful measure if it was advisory but I'm not convinced any score should equal one proper code review. We could use the user reviews and negative reports to provide the user a more rounded view of an unapproved/experimental/unchecked add-on before they install it though:
e.g.
\/ This Addon has passed our automated Checks [what does this mean?]
\/ Other Users rate this Addon as Good (87%) [what does this mean?]
X We've not Checked this Addon to make sure it's well written and not malicious [what does this mean?]

Providing updates to unchecked add-ons should be possible but only if the user understands and accepts it. Ideally the client (Firefox) would also prompt the user in a different way to make it clear they were getting something unchecked.
With some types of add-ons, such as Themes, Language Packs and Search Plugins, though, we could probably get away without any editor involvement because of the lack of code.
One way to maybe speed up addon reviews would be to require authors to provide some steps to test their addon, along with any necessary test files, accounts or whatnot for testing.
We prevent robot ratings (or any type of robotness) by implementing some sort of recaptchas. We could do the same thing before installing/updating addons...it might be a slight inconvenience having to do recaptchas before updating or installing an addon.
We could centralise all addon author activity and support information in AMO. Maybe users can rate how good the support is?
I gotta agree the % changed is a weak metric for trust points; anyone can just change filenames around and spoof percentages or add extra lines here and there.
I like the notion of having users dole out trust points on addons and on other users. So the more a user reviews addons the more points they have...but if they give poor reviews other users can give them a negative rating which would give their reviews less weight. It will be self-policing...but you could get a black market of a group of power users who just go around and take people and addons out lol.
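Roughly what I mean by giving reviews less weight, as a sketch. The reputation numbers, the default weight of 1.0 for unknown reviewers, and the function name are all made up for illustration.

```python
def weighted_rating(reviews, reputation, default_reputation=1.0):
    """Average star rating where each review counts in proportion to its author's reputation.

    reviews is a list of (reviewer, stars); reputation maps reviewer -> weight.
    Returns None when no review carries any weight.
    """
    total_weight = 0.0
    weighted_sum = 0.0
    for reviewer, stars in reviews:
        # Negative reputation is clamped to zero so it cannot flip a rating.
        weight = max(reputation.get(reviewer, default_reputation), 0.0)
        total_weight += weight
        weighted_sum += weight * stars
    return weighted_sum / total_weight if total_weight else None


reviews = [("alice", 5), ("bob", 1), ("carol", 4)]
reputation = {"alice": 3.0, "bob": 0.0, "carol": 1.0}  # bob was voted down by others

print(weighted_rating(reviews, reputation))
```

Here bob's one-star review stops counting once other users have voted his reputation down to zero, which is also exactly the lever a group of power users could abuse.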
I think the verification tool is where the challenge lies in this whole trust system. We would need security checks to prevent authors from spoofing the tool before it does its job. I can see Editors using this verification tool the most for some reason...I think it would help in cases where the author's nominated addon is just massive.
We could rename Public to: Colour + Vulpes? (Fox?) ie Blue Fox or just Vulpes? or Gadget?
Lest anyone misconstrue my previous comments, I will try to make myself 100% clear. The current system relies too heavily on easily fakeable positive reviews of the extension. A system that relies more on the reputation of the author would be 100% better than what we have now. I am not saying that the proposed solution does this or is the correct solution.
> I think that designing trust systems that are hard to game is a really, > really difficult problem, and you need to talk to people who've done it > before you try :-) Advogato (http://www.advogato.org/) was an early > platform for this sort of research.http://www.advogato.org/trust-metric.html
> Let's adopt the perspective of a bad actor who wants to get a malicious > extension onto AMO so people will install it and he can steal their data > or control their machine, and go through the proposed inputs and have a > think about which can be gamed, and how. Of course, the more metrics > there are, you have questions about how the scores combine. If scores > are, say, added, a bad actor may only need to game a few metrics to get > their scores above whatever the magic figure is.
> # Editor Review - an editor's assessment of the add-on > - No more gameable than it is now, although editors may do less > detailed work if they are relying on the trust system, and if the > aim of the exercise is to reduce the amount of editorial control > needed
> # Active Users - the number of users who have the extension installed > - Presumably measured by update pings? Very easily gameable.
> # Ratings - the Bayesian rating of an add-on based on all user reviews > - Given that we don't control accounts very well, this would be > fairly easily gameable too - just robot in good reviews.
> # Flags - the number of times a user has flagged the add-on as a > violation (to be implemented) > - Not gameable, as the bad actor cannot reduce the number. But of > course you need to implement it. And if people's data is being > stolen or their privacy is being violated, they may well not notice > so they won't flag it. This is also an "enumerate badness" model.
> # Add-on Verification Tool - automated check of add-on packaging, > adherence to policies, and common security problems (to be > implemented; see spec) > - Given that the tool will be free software, malware can be written > to pass the checks. Given JavaScript's ability to create code > from strings, I suspect it's very hard to write a full fidelity > code checker. This is the halting problem.
> # Support Information - does the author provide a support URL or e-mail > address? > - Trivially gameable.
> # Other Add-ons by the Developer - how much do we trust the other > add-ons this developer has made? > - Gameable only in that we can apply the same gaming tactics to the > other add-ons.
> In the thread, other people have suggested:
> # How active the add-on author is (Cesar Oliviera) > - Easily gameable.
> # % change from previous version (Morac) > - Easily gameable. Make most of the changes you want in a previous > version, then do a small update which just enables the nastiness.
> All in all, not a great result for non-gameability.
> I would suggest that a better approach would be to trust people, in an > Advogato-like model, and have that trust flow through to extensions and > other people only so far as the trusted people are willing to endorse > them. This is sort of like a modified version of the current system, > which is effectively that extensions go from 0% trusted to 100% trusted > upon the endorsement of a single reviewer. Instead, we could encourage > add-on authors to become part of the trusted, reviewing and endorsing > community, and express their preferences for trustworthy addons on the > site. The value of their endorsement would depend on their > trustworthiness, which would depend on the trustworthiness of those who > trust them, and so on. The "trust anchors" would be the existing reviewers.
> Gerv
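To make the "trust flows from anchors" idea concrete, here is a minimal sketch. It is not Advogato's actual trust metric (which is based on network flow); it is a simplified damped propagation. The names, the decay factor, and the rule for combining endorsements are all assumptions made up for illustration.

```javascript
// Hypothetical data: who endorses whom. Not real AMO data.
const endorsements = {
  reviewerA: ["alice", "bob"],
  alice: ["carol"],
  bob: ["carol"],
  carol: [],
};
const anchors = ["reviewerA"]; // existing reviewers act as trust anchors
const DECAY = 0.5;             // assumed: trust halves with each hop

// Simplified propagation: a person is as trusted as their most trusted
// endorser, damped by `decay`. Repeated for a fixed number of rounds.
function propagateTrust(graph, anchorList, decay, rounds) {
  const scores = {};
  for (const a of anchorList) scores[a] = 1;
  for (let i = 0; i < rounds; i++) {
    for (const [endorser, endorsed] of Object.entries(graph)) {
      const t = scores[endorser] || 0;
      for (const person of endorsed) {
        scores[person] = Math.max(scores[person] || 0, t * decay);
      }
    }
  }
  return scores;
}

// Assumed combination rule: each endorsement independently "vouches" for
// the add-on, so the add-on's trust is 1 - product(1 - endorserTrust).
function addonTrust(scores, endorsers) {
  return 1 - endorsers.reduce((p, e) => p * (1 - (scores[e] || 0)), 1);
}

const trust = propagateTrust(endorsements, anchors, DECAY, 3);
const exampleAddonTrust = addonTrust(trust, ["alice", "carol"]);
```

With this rule an add-on endorsed only by unknown accounts stays at zero, which is the property that makes fake endorsements less useful than fake reviews.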
Another issue with the current AMO approval process is that updated extensions for new release versions are often not available until after the release. If you try to release when you are asked, you go to the end of a queue that has to go through the entire defined review process. If you wait until the new version of Firefox is released and then submit your update as changes required to work with the new version, it seems to be just rubber-stamped.
A trust system could theoretically work, but I think it may be too hard to balance and maintain in any reasonable way as described. Personally, I like having to wait for a review of some kind; I like that I have to be checked and approved by someone else staring at the code to make sure everything is ok. I just hate the long time it takes.
I think three things could help here:
1) Take better advantage of the addition of release channels.
2) Split editor reviews into two parts.
3) Formalize user feedback pending public status via a survey we ask users to fill out.
My very rough suggestion:
* Submission phase
- All add-on files go through automatic security and consistency checks. Failure gives automatic rejection. Success with or without flags progresses.
\/
* Experimental phase
- New file is listed for unstable release channel
- No auto-updating
- Add-on gets red background and red button
- Checkbox and popup "I know this is unreviewed and may explode" box
\/
Phase I editor review
- Check for showstopper issues only, namely security problems, high stability risks, and policy violations. Don't care if it's full of other bugs or potential issues. (though, if any are found the developer should probably be notified that they'll be rejected in phase II) This check should take significantly less time and effort than a full review, and thus can be done more frequently and sooner.
\/
* Unstable phase
- Auto-updating within unstable release channel allowed
- Add-on gets orange background and orange button
- Checkbox and light warning
\/
User surveys
- After install of any unstable add-on, AMO will automatically ask the user to come back later and fill out a quick survey. (results viewable by developers and editors) If after some wait period no explosions are reported, proceed to next editor review.
\/
Phase II editor review
- Detailed check of code and function for all issues. Make sure it works fully, lacks any noticeable bugs, and isn't likely to have future problems easily. A full editor review to decide if the add-on meets AMO's quality standards for a public add-on.
\/
* Stable phase
- File may be moved into stable release channel
- Gets green background and button
- Checkbox goes away
The concept of a trust rating could still be very useful to sort the queue of add-ons waiting for phase II editor review so that more trusted add-ons get checked out sooner. Less trusted add-ons would wait in the queue longer, presumably to build up trust.
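The phases above can be read as a small state machine, with the trust rating used only to order the phase II queue. The sketch below is one possible reading, not a description of how AMO works: the state names follow the post, but the event names, the `awaitingPhase2` state, the `rejected` outcomes for failed reviews, and the `trust` field are assumptions.

```javascript
// Allowed transitions between phases. Event names are hypothetical.
const TRANSITIONS = {
  submitted:      { autoCheckPassed: "experimental", autoCheckFailed: "rejected" },
  experimental:   { phase1Passed: "unstable", phase1Failed: "rejected" },
  unstable:       { surveysClean: "awaitingPhase2" },
  awaitingPhase2: { phase2Passed: "stable", phase2Failed: "rejected" },
};

// How each listed phase is presented, as described in the post.
const PRESENTATION = {
  experimental: { color: "red", autoUpdate: false, warning: "popup" },
  unstable:     { color: "orange", autoUpdate: true, warning: "light" },
  stable:       { color: "green", warning: "none" },
};

// Move a file to its next phase, refusing transitions the table does not list.
function advance(phase, event) {
  const next = (TRANSITIONS[phase] || {})[event];
  if (!next) {
    throw new Error("no transition from " + phase + " on " + event);
  }
  return next;
}

// Order the phase II queue so that more trusted add-ons are reviewed first.
// Returns a new array and leaves the input untouched.
function sortPhase2Queue(queue) {
  return [...queue].sort((a, b) => b.trust - a.trust);
}

// Usage: walk one file through the happy path.
let phase = "submitted";
for (const ev of ["autoCheckPassed", "phase1Passed", "surveysClean", "phase2Passed"]) {
  phase = advance(phase, ev);
}
const ordered = sortPhase2Queue([
  { name: "addonA", trust: 20 },
  { name: "addonB", trust: 80 },
]);
```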
Another trust modifier: how long has the add-on / its author been around?
You should let users decide which add-ons they want updated automatically, and that choice should be a trust modifier too. The Add-ons window could have a check box for automatic updates beside every extension. The trust score only sets the default for this. Maybe a warning can be displayed when a user enables automatic updates for a low-score add-on, but he/she should still be able to enable it and get the updates. Not allowing this just makes users' lives harder.
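A minimal sketch of that rule, assuming a numeric trust score and a made-up threshold (neither is defined anywhere in the proposal):

```javascript
// Hypothetical cutoff: scores at or above this default to auto-update on.
const AUTO_UPDATE_THRESHOLD = 50;

// userChoice is true, false, or undefined (the user has not touched the box).
function autoUpdateSetting(addon, userChoice) {
  const defaultOn = addon.trustScore >= AUTO_UPDATE_THRESHOLD;
  const enabled = userChoice === undefined ? defaultOn : userChoice;
  // Warn only when the user overrides a default of "off"; never block them.
  const warn = enabled && !defaultOn;
  return { enabled, warn };
}

const lowScoreOptIn = autoUpdateSetting({ trustScore: 10 }, true);
```

Here `lowScoreOptIn` comes back enabled with a warning: the score decides the default and the warning, and the user decides the outcome.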
We still have to go through editors. I liked DaveG's idea about splitting editor reviews into two parts. I suggest that code reviewers get more points, and more weight for their non-code reviews as well, to encourage code reviewing.
> Lest anyone misconstrue my previous comments, I will try to make myself 100% clear.
> The current system relies too heavily on easily fakeable positive reviews of the extension.
> A system that relies more on the reputation of the author would be 100% better than what we have now.
> I am not saying that the proposed solution does this or is the correct solution.
We don't require reviews of extensions any more to gain public status (we haven't for a few months).
The 2 phase review idea is interesting but I'm not sure it would speed up the process as much as you think. Currently, for most addons it's the code review which takes the time, and the testing is just a quick check to make sure its main function appears to work (the developer & experimental downloaders having tested it thoroughly before it's nominated). Having to check the code twice (once in phase1 and then again in phase2) will easily end up taking more time overall for a single addon.
So if the majority of add-ons only go to phase1 you'll get a small speed up. If most developers don't want an 'unstable' tag next to their work (most developers currently don't want an 'experimental' tag; hence the amount of submissions) then the overall workload will be at best roughly the same.
> A trust system could theoretically work, but I think it may be too hard to balance and maintain in any reasonable way as described. Personally, I like having to wait for a review of some kind; I like that I have to be checked and approved by someone else staring at the code to make sure everything is ok. I just hate the long time it takes.
> I think three things could help here:
> 1) Take better advantage of the addition of release channels.
> 2) Split editor reviews into two parts.
> 3) Formalize user feedback pending public status via a survey we ask users to fill out.
> My very rough suggestion:
> * Submission phase
> - All add-on files go through automatic security and consistency checks. Failure gives automatic rejection. Success with or without flags progresses.
> \/
> * Experimental phase
> - New file is listed for unstable release channel
> - No auto-updating
> - Add-on gets red background and red button
> - Checkbox and popup "I know this is unreviewed and may explode" box
> \/
> Phase I editor review
> - Check for showstopper issues only, namely security problems, high stability risks, and policy violations. Don't care if it's full of other bugs or potential issues. (though, if any are found the developer should probably be notified that they'll be rejected in phase II) This check should take significantly less time and effort than a full review, and thus can be done more frequently and sooner.
> \/
> * Unstable phase
> - Auto-updating within unstable release channel allowed
> - Add-on gets orange background and orange button
> - Checkbox and light warning
> \/
> User surveys
> - After install of any unstable add-on, AMO will automatically ask the user to come back later and fill out a quick survey. (results viewable by developers and editors) If after some wait period no explosions are reported, proceed to next editor review.
> \/
> Phase II editor review
> - Detailed check of code and function for all issues. Make sure it works fully, lacks any noticeable bugs, and isn't likely to have future problems easily. A full editor review to decide if the add-on meets AMO's quality standards for a public add-on.
> \/
> * Stable phase
> - File may be moved into stable release channel
> - Gets green background and button
> - Checkbox goes away
> The concept of a trust rating could still be very useful to sort the queue of add-ons waiting for phase II editor review so that more trusted add-ons get checked out sooner. Less trusted add-ons would wait in the queue longer, presumably to build up trust.
This sounds like a good idea generally. Just one point - in the spec for the Add-on Verification Tool (http://docs.google.com/View?docid=dcfr9qrp_1c2pgcsfh) it was suggested to flag use of eval. This is definitely a good idea (to make it harder to execute remote code, etc), but I'd like to make sure this isn't an absolute restriction. In particular, [[ eval("func="+func.toString().replace(/oldText/,"newText")); ]] is the standard technique for dynamically patching Firefox UI code, and is safe because it only evaluates known code. (While such patching should be avoided where possible, advanced extensions often have no other way to achieve the functionality they need).
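For anyone who hasn't seen the patching technique, here is a self-contained sketch of it. `greet` is a made-up stand-in for a piece of browser UI code, and `patchFunction` is a hypothetical helper, not an existing API. The "pattern not found" check is an addition for illustration.

```javascript
// Stand-in for an existing function we want to patch (hypothetical).
function greet(name) {
  return "Hello, " + name + " (oldText)";
}

// Serialize the function, edit its source text, and evaluate the result.
// Only the function's own (known) source plus a fixed replacement is
// evaluated; nothing comes from the network or from user input.
function patchFunction(fn, pattern, replacement) {
  const source = fn.toString();
  const patched = source.replace(pattern, replacement);
  if (patched === source) {
    // The target changed upstream; do not silently install an unpatched copy.
    throw new Error("pattern not found; refusing to patch");
  }
  // Indirect eval: the new function is created in the global scope, so it
  // loses access to any closure variables the original could see.
  return (0, eval)("(" + patched + ")");
}

const patchedGreet = patchFunction(greet, /oldText/, "newText");
const patchedResult = patchedGreet("Ann");
```

Note that a checker which flags every `eval` call would flag this too, which is why a flag for human review seems more appropriate than an automatic rejection.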
On Jul 4, 6:04 am, Andrew Williamson <evil.j...@yahoo.com> wrote:
> The 2 phase review idea is interesting but I'm not sure it would speed up the process as much as you think. Currently, for most addons it's the code review which takes the time, and the testing is just a quick check to make sure its main function appears to work (the developer & experimental downloaders having tested it thoroughly before it's nominated). Having to check the code twice (once in phase1 and then again in phase2) will easily end up taking more time overall for a single addon.
I'm not just suggesting that the code and overall reviews be split up; the phase 1 review is intended to be a faster code review. It'd be only for showstopper issues. One way to do it would be to check primarily issues flagged by the automatic checker as possible security/stability issues. (yes, that would require writing a very good autochecker) The point here is to not have the editor scour the code in depth, but rather just scan for things that are big no-nos. Just enough review to be reasonably assured that it's remotely safe, and then the phase 2 review for public status would catch the rest.
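As a rough illustration of "check primarily what the autochecker flagged", here is a deliberately naive line scanner. The pattern list is invented for this example and is not taken from the Add-on Verification Tool spec; a real checker would need to parse the code rather than match text.

```javascript
// Hypothetical showstopper patterns for a phase 1 scan.
const SHOWSTOPPER_PATTERNS = [
  { name: "eval call", regex: /\beval\s*\(/ },
  { name: "Function constructor", regex: /\bnew\s+Function\s*\(/ },
  { name: "setTimeout with string", regex: /\bsetTimeout\s*\(\s*["']/ },
];

// Return one flag per (line, pattern) match so an editor can jump to it.
function flagSource(source) {
  const flags = [];
  source.split("\n").forEach((line, index) => {
    for (const pattern of SHOWSTOPPER_PATTERNS) {
      if (pattern.regex.test(line)) {
        flags.push({ line: index + 1, name: pattern.name });
      }
    }
  });
  return flags;
}

const sample = [
  "var a = 1;",
  'eval("func=" + patched);',
  'window["ev" + "al"](payload);', // builds the name from strings
].join("\n");

const flags = flagSource(sample);
```

The scanner flags line 2 but not line 3, which calls the same function through a name built from strings. That gap is why the flags can only steer the editor's attention and can't replace the editor.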
> So if the majority of add-ons only go to phase1 you'll get a small speed up. If most developers don't want an 'unstable' tag next to their work (most developers currently don't want an 'experimental' tag; hence the amount of submissions) then the overall workload will be at best roughly the same.
That's why I also suggested user surveys. This offloads some of the work in screening out add-ons that aren't ready yet. Many of the add-ons that aren't going to be suitable for public status will avoid wasting time on a phase 2 review because their user surveys would come back as insufficient. Now, of course, this would also require coming up with a suitable user survey system that we can actually rely on, but it could be helpful to implement. This concept is basically why user reviews used to be required, but if it were done as a formalized survey instead of a generic review I think it could be made useful again.