Removing the Sandbox

Justin Scott

Jul 1, 2009, 6:37:10 PM
The "Sandbox Model" addons.mozilla.org uses to organize and review
add-ons was first announced almost 3 years ago
(http://blog.fligtar.com/2006/11/21/reviewing-the-review-process/).
Since then, we've made a number of changes based on user feedback that,
in my opinion, have greatly improved the experience of finding and
installing add-ons that haven't been officially reviewed yet.

Today, the main feedback concerning the review and distribution process
of add-ons is:
* developers feel it takes too long for add-ons to be reviewed, and
* users and developers want to receive updates to add-ons that they have
installed that haven't been reviewed yet

It's important for us to balance our desire for all add-ons to be
discoverable and easy to install with the need for security measures for
add-ons that haven't been reviewed yet.

After taking many of these issues into account, I've come up with a
proposal for removing the public and sandbox classifications on the site
and moving to a more flexible, comprehensive trust system based on
everything we know about an add-on. If you're interested in the review
process and distribution of add-ons, please read the proposal below and
give us your feedback.

Proposal:
http://docs.google.com/View?docID=dfntthnr_0f3wtksf2&revision=_latest

Thanks!

Justin Scott

Cesar Oliveira

Jul 1, 2009, 7:06:05 PM
Another trust modifier could be how active the add-on is. An add-on
that is updated often would benefit more from the trusted status.
Though, I don't see a difference between trusted status and public
status with the removal of the sandbox (maybe it is mentioned and I
just missed it).

The new trust system seems like a positive new change to AMO. I look
forward to the day when I can have updates distributed without
having to go through editors.

ashughes

Jul 1, 2009, 7:09:16 PM

Having just read the proposal, I like it a lot. Not only will it get
add-ons to the people faster, it should also provide a more accurate
and knowledgeable "shopping" experience. I noticed the Trust
Indicators, but I didn't see anything about user reviews and press.
While these might be a bit more edge case in their occurrence, I think
they are quite valid in determining the trust of an add-on.

All in all, great proposal. Keep up the great work!

Cheers,

Anthony Hughes (:ashughes)

Mike Bellwick

Jul 1, 2009, 7:31:38 PM

Hi, just my 2 cents:

A mandatory code review is the only way to keep malware at bay in my
opinion. Everything else can (and will) be cheated.

So I'd keep the current system, but would add the ability for users to
find new or experimental addons more easily, for example by
introducing an "Experimental" category, or by providing new search or
view modes like "only recommended addons", "new addons", "all public
addons", "only experimental addons", "all addons" and to sort by
rating, popularity, age, application version, number of reviews etc.

That way it's not as bad being experimental since there might be
people who are actively searching for new stuff to try out. Currently
it is pretty difficult finding experimental addons without the editor
review queue.

Mike

Zadkiel

Jul 1, 2009, 8:42:58 PM
This is pretty spiffy Justin :)

Cesar's suggestion of author activity is a good trust modifier.
It might prevent addons from being abandoned.

Anthony's suggestion of also including the user reviews and press
would also be nice but how would we keep track of that information?

One possible issue with the trust score is that, as we tweak it
to prevent gaming, these adjustments might move
an addon from public to sandbox or vice versa.

Also we might get into addon wars, where one addon's
user base will purposefully try to lower the trust score of another.

Trying to figure out how to curb the gaming is going to be an
interesting, exciting process,
especially if the Trust Score system is going to be transparent.

Morac

Jul 2, 2009, 1:23:55 AM
While it would be nice to be able to speed up the review process (one
add-on of mine took about 8 months for the initial review), I do think
this could be misused if not implemented properly. There have been
examples of "trusted" add-ons doing untrusted things.

Regarding code reviews, except perhaps on the initial update, are they
even done currently? I know a number of add-ons (including mine)
which appear to be reviewed simply by seeing if they install. No
actual testing is done, let alone a code review. I've actually had
releases with fairly major bugs in them which I missed before the
release went public. Unfortunately, while the buggy version got
pushed out within a day, the fixed version took over a week to get
reviewed.

I think a good potential trust modifier could be a percent change from
the previous version. I'm not even sure if this is something that
could be measured automatically or not, but I don't see why a minor
release for an add-on should take as long (or longer) to review as a
major one.
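
For illustration only, here is a minimal sketch of how a percent-change figure might be measured automatically, assuming a simple line-by-line comparison of one file between two versions. The function name percentChanged and the line-counting approach are assumptions made for this example, not anything AMO implements.

```javascript
// Hypothetical helper: estimates how much a file changed between two
// versions by comparing its lines as multisets (line order is ignored).
function percentChanged(oldSource, newSource) {
  const oldLines = oldSource.split("\n");
  const newLines = newSource.split("\n");

  // Count how many times each line occurs in the old version.
  const counts = new Map();
  for (const line of oldLines) {
    counts.set(line, (counts.get(line) || 0) + 1);
  }

  // A new line that matches a still-unused old line counts as unchanged.
  let unchanged = 0;
  for (const line of newLines) {
    const remaining = counts.get(line) || 0;
    if (remaining > 0) {
      counts.set(line, remaining - 1);
      unchanged++;
    }
  }

  const total = Math.max(oldLines.length, newLines.length);
  return 100 * (1 - unchanged / total);
}

console.log(percentChanged("a\nb\nc\nd", "a\nb\nc\nx")); // 25
```

A figure like this only measures the size of a change, not what the changed lines actually do.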

Erwan

Jul 2, 2009, 4:37:50 AM
Hi,

I have to say this would be great, and can smooth the process a lot.
The implementation will make a big difference, however.
* The 1 - 4,999 bucket (with no installation) should really be a
transitional one, because not having updates is problematic for users.
In other words, if passing 5,000 is as hard as getting out of the
sandbox is today, the problem will stay.
* I understand that the 0 or less is really for extensions that behaved
badly and get punished. But we have to make sure that once an
extension gets there, it doesn't get stuck forever. There is a risk
that this happens because if it can't be installed it won't have users
(no point), so it won't get rated (no point), so it has little chance
to rise even if the author fixed the problem that caused the
punishment. One option would be to have it blocked for one month, then
back to 1 point. Unless of course getting under 0 is only for
incomplete add-ons.

Also, that may sound obvious but why not add some points for
extensions that are or have been recommended by AMO? That's definitely
a proof of trust.

It's really cool to hear about release channels also, it's something
developers have been waiting for a long time.


Erwan

Dave Townsend

Jul 2, 2009, 4:56:11 AM

This looks really good. Is there a mapping that shows which of the trust
states would show up in the search results in the Firefox add-ons
manager? I guess I'm assuming only the 10,000+ ranges since we have no
checkbox there as such.

Is the trust score for the add-on as a whole or just a single version of
the add-on? If the latter then have you considered a kind of trust
cascade system where a newly posted version of an add-on inherits a
percentage of the trust of the old version. This could effectively mean
that an update for an add-on that already has a very high trust score
could become public and arrive in users' applications very quickly.
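
A minimal sketch of the cascade idea, assuming a new version simply starts with a fixed percentage of the previous version's score. The 80% figure and the name initialTrustForUpdate are made up for illustration; the proposal does not specify either.

```javascript
// Hypothetical "trust cascade": a newly posted version starts with a fixed
// percentage of the trust score of the version it replaces.
const INHERITED_PERCENT = 80; // assumed value, purely for illustration

function initialTrustForUpdate(previousVersionScore, percent = INHERITED_PERCENT) {
  // A version with zero or negative trust passes nothing on.
  if (previousVersionScore <= 0) return 0;
  return Math.floor((previousVersionScore * percent) / 100);
}

console.log(initialTrustForUpdate(20000)); // 16000
```

With numbers like these, an update to a highly trusted add-on would start well above a brand-new add-on, which starts from nothing.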

Gervase Markham

Jul 2, 2009, 6:20:03 AM
On 01/07/09 23:37, Justin Scott wrote:
> Proposal:
> http://docs.google.com/View?docID=dfntthnr_0f3wtksf2&revision=_latest

I think that designing trust systems that are hard to game is a really,
really difficult problem, and you need to talk to people who've done it
before you try :-) Advogato (http://www.advogato.org/) was an early
platform for this sort of research.
http://www.advogato.org/trust-metric.html

Let's adopt the perspective of a bad actor who wants to get a malicious
extension onto AMO so people will install it and he can steal their data
or control their machine, and go through the proposed inputs and have a
think about which can be gamed, and how. Of course, the more metrics
there are, the more questions you have about how the scores combine. If scores
are, say, added, a bad actor may only need to game a few metrics to get
their scores above whatever the magic figure is.

# Editor Review - an editor's assessment of the add-on
- No more gameable than it is now, although editors may do less
detailed work if they are relying on the trust system, and if the
aim of the exercise is to reduce the amount of editorial control
needed

# Active Users - the number of users who have the extension installed
- Presumably measured by update pings? Very easily gameable.

# Ratings - the Bayesian rating of an add-on based on all user reviews
- Given that we don't control accounts very well, this would be
fairly easily gameable too - just robot in good reviews.

# Flags - the number of times a user has flagged the add-on as a
violation (to be implemented)
- Not gameable, as the bad actor cannot reduce the number. But of
course you need to implement it. And if people's data is being
stolen or their privacy is being violated, they may well not notice
so they won't flag it. This is also an "enumerate badness" model.

# Add-on Verification Tool - automated check of add-on packaging,
adherence to policies, and common security problems (to be
implemented; see spec)
- Given that the tool will be free software, malware can be written
to pass the checks. Given JavaScript's ability to create code
from strings, I suspect it's very hard to write a full fidelity
code checker. This is the halting problem.

# Support Information - does the author provide a support URL or e-mail
address?
- Trivially gameable.

# Other Add-ons by the Developer - how much do we trust the other
add-ons this developer has made?
- Gameable only in that we can apply the same gaming tactics to the
other add-ons.

In the thread, other people have suggested:

# How active the add-on author is (Cesar Oliveira)
- Easily gameable.

# % change from previous version (Morac)
- Easily gameable. Make most of the changes you want in a previous
version, then do a small update which just enables the nastiness.

All in all, not a great result for non-gameability.
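
The worry about added scores can be shown with a small sketch. The metric names, point values and the 10,000 cut-off below are assumptions for illustration only; they are not taken from the proposal.

```javascript
// Hypothetical additive trust score: every input contributes points and the
// total is compared against a single cut-off.
function additiveScore(points) {
  return Object.values(points).reduce((sum, p) => sum + p, 0);
}

const THRESHOLD = 10000; // assumed cut-off

// A bad actor with no editor review who inflates a few easily faked inputs.
const gamed = { editorReview: 0, activeUsers: 6000, ratings: 5000, supportInfo: 500 };

console.log(additiveScore(gamed) >= THRESHOLD); // true
```

Under simple addition, a zero for the one input that is hard to fake does not stop the total from crossing the line.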

I would suggest that a better approach would be to trust people, in an
Advogato-like model, and have that trust flow through to extensions and
other people only so far as the trusted people are willing to endorse
them. This is sort of like a modified version of the current system,
which is effectively that extensions go from 0% trusted to 100% trusted
upon the endorsement of a single reviewer. Instead, we could encourage
add-on authors to become part of the trusted, reviewing and endorsing
community, and express their preferences for trustworthy addons on the
site. The value of their endorsement would depend on their
trustworthiness, which would depend on the trustworthiness of those who
trust them, and so on. The "trust anchors" would be the existing reviewers.
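
To make the flow of trust concrete, here is a deliberately simplified sketch. It is not Advogato's actual metric, which is based on network flow; it only assumes that each endorsement passes on a fixed fraction of the endorser's trust, starting from the trust anchors. The names propagateTrust and decay, and the sample data, are hypothetical.

```javascript
// Simplified trust propagation (NOT Advogato's network-flow metric).
// Each endorsement passes on a fixed fraction of the endorser's trust, and
// an entity's trust is the best value reachable from any trust anchor.
function propagateTrust(anchors, endorsements, decay = 0.5) {
  const trust = new Map(anchors.map((name) => [name, 1]));
  const queue = [...anchors];

  while (queue.length > 0) {
    const endorser = queue.shift();
    const passedOn = trust.get(endorser) * decay;
    for (const endorsed of endorsements[endorser] || []) {
      // Only update (and revisit) when this path gives a higher trust value.
      if (passedOn > (trust.get(endorsed) || 0)) {
        trust.set(endorsed, passedOn);
        queue.push(endorsed);
      }
    }
  }
  return trust;
}

// Hypothetical data: an existing reviewer endorses an author, who endorses an add-on.
const result = propagateTrust(["reviewer"], {
  reviewer: ["authorA"],
  authorA: ["addonX"],
});
console.log(result.get("addonX")); // 0.25
```

Anything nobody trusted has endorsed never appears in the result at all, so a crowd of fake accounts endorsing each other gains nothing unless a trusted person endorses one of them.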

Gerv

Robert Kaiser

Jul 2, 2009, 7:03:35 AM
Gervase Markham wrote:
> # Flags - the number of times a user has flagged the add-on as a
> violation (to be implemented)
> - Not gameable, as the bad actor cannot reduce the number.

Easily misusable though by people who want to make life harder for their
"competitor" add-ons.

I think we need to take care of people wanting to game up their own score as
well as those trying to game down scores of others (either because of
competition or because of personal dislike or such).

Robert Kaiser

Robert Kaiser

Jul 2, 2009, 7:06:53 AM

What I like a lot in there is the concept of update channels coming to
AMO with it. I talked to and mailed Rey Bango a while ago about a few
add-ons (which probably would be in a trusted state - language packs,
lightning, i.e. things coming from our own Mozilla repos) that would
even like to provide nightly updates through AMO if possible. Update
channels are one thing needed for those as well (would be a "nightly"
channel there, but the basically same strategy).

Robert Kaiser

lith

Jul 2, 2009, 8:18:22 AM
> # How active the add-on author is (Cesar Oliveira)

I don't quite see why activity alone should be a trust modifier. If a
project is updated every day, it could also mean that the code isn't
properly tested and that the developers have no plan where they are
heading to.

Some of the addons I consider essential haven't been updated in years
because they simply work as promised by their developers.

Gijs Kruitbosch

Jul 2, 2009, 9:33:39 AM

While I agree that the current situation is (rightly) perceived as problematic
by (some) add-on authors, I am not sure this is the right solution. Gerv has
already provided a good summary of how gameable this system is, and I would tend
to agree with what he has said. Although the happiness of our add-on authors
and editors is important, I think the happiness of our end users is still more
important, and a system that is as gameable as this will open the door
significantly wider than it is now in terms of allowing malicious add-ons.

I will add that malice and user trust are not everything, so I would be skeptical
even of the system Gerv suggests: although I do not have statistics (could we
get some?) I believe a fairly significant number of the add-ons/updates we
reject are rejected based on security issues, and a literally huge number of
rejections are based on not namespacing/wrapping code in the browser window. In
the first case, the add-on will work extremely well for all users - until they
join an insecure wireless network that exploits the security issues in the
add-on (eval-ing source procured over http, or similar issues). In the second,
it will work well until they install some other add-on which defines functions
with the same name, after which one will stop working -- diagnosing these issues
is a massive headache. Like Gerv, I doubt whether any of this could easily be
caught by automated tools (although "mistakes" are definitely easier to find
than purposeful malice).
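
As an illustration of the namespacing point, here is a minimal sketch with two made-up add-ons. If both defined a bare global function called init in the shared browser window, whichever loaded last would overwrite the other; wrapping each add-on's code in its own object avoids that. The names MyAddonA and MyAddonB are hypothetical.

```javascript
// Each hypothetical add-on keeps its functions inside one uniquely named
// object instead of defining bare globals such as "init".
var MyAddonA = {
  init: function () { return "A initialised"; },
};

var MyAddonB = {
  init: function () { return "B initialised"; },
};

// Both functions stay callable because neither claims the global name "init".
console.log(MyAddonA.init()); // "A initialised"
console.log(MyAddonB.init()); // "B initialised"
```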

In the end, I think code review by editors is one of the best ways we have of
protecting our users, and I think that that is more important than the
discomfort of add-on authors or editors.

Cheers,
Gijs

Co-developer of Venkman and ChatZilla, author of Chrome List, AMO editor.

Mike Bellwick

Jul 2, 2009, 12:38:44 PM

Hi,

Gervase did a nice job of summarizing how gameable such a system would
be.

Now, if the aim is to speed up the review process and make life easier
for editors, while at least not sacrificing the security or privacy of
add-on users, I would propose the following:

- Provide a technical feedback channel for addons, just like the user
reviews, but for reporting defects, breaches of privacy policy,
copyright violations etc. But also for positive feedback like "works
well on MacOS X". These feedbacks should be viewable by the author and
editors, and maybe (should be open for discussion) by normal users.
Technical feedback could be categorized into: Bug reports, feature
requests, security bug reports, breaches of privacy policy and reports
of successful usage scenarios. This could be used to calculate a
functional trust score.

- Technical feedbacks should be moderatable like reviews. Resolved
bugs, feature requests or false technical issues should not negatively
impact an addon for long. Maybe remove old technical feedbacks a week
after an update.

- Make it the primary job of the editors to do code reviews of Addons.
Keep that down to checks for security and bad coding practice (loose
globals, uses of eval, binary code, obfuscated js etc.).

- Addons that have not been code-reviewed stay in the sandbox

- Rely on the user base for functional testing and bug reports

- If an addon is code reviewed, has been downloaded a while from the
sandbox and does not have important technical defects reported, let it
go public.

- Every existing public addon should be able to provide a current
"beta" alongside the stable version. Users have to check a checkbox in
order to download the untrusted beta version. Separate technical
reports for the beta should be possible.

Any opinions on that?

bye,

Mike

Ricmacas

Jul 2, 2009, 3:22:51 PM
Let me also propose a heuristic engine to detect potentially malicious
addons. That would be somewhat complicated, but certainly not
impossible.
We could set some actions that are undesirable, like sending bookmarks
to the web in clear text or anything that could concern the privacy of
the user.
Basically, with this engine, we could stop a malicious update to a
trusted addon from happening.
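
A very rough sketch of what the simplest form of such an engine could look like, assuming it just matches source text against a list of patterns. The two rules and the function name scanSource are invented for this example.

```javascript
// Naive heuristic scanner; the rule list is a made-up illustration.
const RULES = [
  { id: "uses-eval", pattern: /\beval\s*\(/ },
  { id: "plain-http-url", pattern: /["']http:\/\// },
];

// Returns the ids of all rules whose pattern appears in the source text.
function scanSource(source) {
  return RULES.filter((rule) => rule.pattern.test(source)).map((rule) => rule.id);
}

const sample = 'var u = "http://example.com/collect"; eval(code);';
console.log(scanSource(sample)); // flags both rules
```

Text matching of this kind is easy to evade: a call written as window["ev" + "al"](code) matches neither rule.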

Ohh, I know, I have too much imagination.

Andrew Williamson

Jul 2, 2009, 3:50:37 PM

Hi,

I think we should definitely drop the terms public and sandbox from the site as
they've acquired negative connotations along the way, furthering the impression
that the first thing you have to do is get your add-on public (which often got
immediately rejected, discouraging further effort).

I still think code review is a vital part of protecting the less experienced
users from poorly written or malicious add-ons though. The trust model, while in
an ideal world a good replacement, is inherently flawed as Gerv explained already.

I agree with what Gijs has written summarising the common reasons we reject
addon submissions and how a trust model based on popularity will never pick up
these reasons reliably.

The Trust Score would be a useful measure if it was advisory but I'm not
convinced any score should equal one proper code review. We could use the user
reviews and negative reports to provide the user a more rounded view of an
unapproved/experimental/unchecked add-on before they install it though:

e.g.
\/ This Addon has passed our automated Checks [what does this mean?]
\/ Other Users rate this Addon as Good (87%) [what does this mean?]
X We've not Checked this Addon to make sure it's well written and not malicious
[what does this mean?]

Providing updates to unchecked add-ons should be possible but only if the user
understands and accepts it. Ideally the client (firefox) would prompt the user
in a different way to make it clear they were getting something unchecked also.

With some types of Add-ons, such as Themes, Language Packs and Search Plugins
though we could probably get away without any editor involvement because of the
lack of code.

thanks,
Andrew Williamson

(AMO Editor)

Zadkiel

Jul 3, 2009, 12:14:50 AM
One way to maybe speed up addon reviews would be to require authors
to provide some steps to test their addon, along with any necessary
test files or accounts or whatnot for testing.

We prevent robot ratings (or any type of robotness) by implementing
some sort of recaptchas.
We could do the same thing before installing/updating addons...it
might be a slight inconvenience having to do recaptchas before
updating or installing an addon.

We could centralise all addon author activity and support information
in AMO.
Maybe users can rate how good the support is?

I gotta agree the % changed is a weak metric for trust points:
anyone can just change filenames around and spoof percentages
or add extra lines here and there.

I like the notion of having users dole out trust points on addons and
on other users. So the more the user reviews addons the more points
they have...but if they give poor reviews other users can give them a
negative rating which would give their reviews less weight. It will be
self-policing...but you could get a black market of a group of power
users who just go around and take people and addons out lol.

I think the verification tool is where the challenge lies in this
whole trust system. We would need security checks to prevent authors
from spoofing the tool before it does its job.
I can see Editors using this verification tool the most for some
reason...I think it would help in cases where the author's nominated
addon is just massive.

We could rename Public to:
Colour + Vulpes? (Fox?)
ie Blue Fox or just Vulpes?
or
Gadget?

Sandbox to:
Sly Fox?
Kitsune?
Cog?

Maybe we can ask marketing to help us :P

William Gianopoulos

Jul 3, 2009, 3:00:07 PM
On Jul 2, 6:20 am, Gervase Markham <g...@mozilla.org> wrote:
> On 01/07/09 23:37, Justin Scott wrote:
>
> > Proposal:
> > http://docs.google.com/View?docID=dfntthnr_0f3wtksf2&revision=_latest
>
> I think that designing trust systems that are hard to game is a really,
> really difficult problem, and you need to talk to people who've done it
> before you try :-) Advogato (http://www.advogato.org/) was an early
> platform for this sort of research.
> http://www.advogato.org/trust-metric.html

Lest anyone misconstrue my previous comments, I will try to make myself
100% clear.
The current system relies too heavily on easily fakeable positive
reviews of the extension.
A system that relies more on the reputation of the author would be
100% better than what we have now.
I am not saying that the proposed solution does this or is the correct
solution.

William Gianopoulos

Jul 3, 2009, 3:09:00 PM
On Jul 2, 6:20 am, Gervase Markham <g...@mozilla.org> wrote:
> On 01/07/09 23:37, Justin Scott wrote:
>
> > Proposal:
> > http://docs.google.com/View?docID=dfntthnr_0f3wtksf2&revision=_latest
>
> I think that designing trust systems that are hard to game is a really,
> really difficult problem, and you need to talk to people who've done it
> before you try :-) Advogato (http://www.advogato.org/) was an early
> platform for this sort of research.
> http://www.advogato.org/trust-metric.html

Another issue with the current AMO approval process is why updated
extensions for new release versions are not available until after the
release. If you try to release when you are asked, you get put at the
end of some queue which actually has to go through the entire defined
review process. If you wait till the new version of Firefox is released
and then submit your update as changes required to work with the new
version, it seems to be just rubber-stamped.

DaveG

Jul 3, 2009, 3:15:59 PM
A trust system could theoretically work, but I think it may be too
hard to balance and maintain in any reasonable way as described.
Personally, I like having to wait for a review of some kind; I like
that I have to be checked and approved by someone else staring at the
code to make sure everything is ok. I just hate the long time it
takes.

I think three things could help here:
1) Take better advantage of the addition of release channels.
2) Split editor reviews into two parts.
3) Formalize user feedback pending public status via a survey we ask
users to fill out.

My very rough suggestion:

* Submission phase
- All add-on files go through automatic security and consistency
checks. Failure gives automatic rejection. Success with or without
flags progresses.
\/
* Experimental phase
- New file is listed for unstable release channel
- No auto-updating
- Add-on gets red background and red button
- Checkbox and popup "I know this is unreviewed and may explode" box
\/
Phase I editor review
- Check for showstopper issues only, namely security problems, high
stability risks, and policy violations. Don't care if it's full of
other bugs or potential issues. (though, if any are found the
developer should probably be notified that they'll be rejected in
phase II) This check should take significantly less time and effort
than a full review, and thus can be done more frequently and sooner.
\/
* Unstable phase
- Auto-updating within unstable release channel allowed
- Add-on gets orange background and orange button
- Checkbox and light warning
\/
User surveys
- After install of any unstable add-on, AMO will automatically ask the
user to come back later and fill out a quick survey. (results viewable
by developers and editors) If after some wait period no explosions are
reported, proceed to next editor review.
\/
Phase II editor review
- Detailed check of code and function for all issues. Make sure it
works fully, lacks any noticeable bugs, and isn't likely to have
future problems easily. A full editor review to decide if the add-on
meets AMO's quality standards for a public add-on.
\/
* Stable phase
- File may be moved into stable release channel
- Gets green background and button
- Checkbox goes away

The concept of a trust rating could still be very useful to sort the
queue of add-ons waiting for phase II editor review so that more
trusted add-ons get checked out sooner. Less trusted add-ons would
wait in the queue longer, presumably to build up trust.

seppo.ka...@gmail.com

Jul 4, 2009, 5:36:07 AM
Another trust modifier: how long has the add-on / its author been
around?

You should let users decide which add-ons they want updated
automatically.
And that should be a trust modifier too. The Add-ons window could have a
check box for automatic updates beside every extension. The trust score
only sets the default for this. Maybe a warning can be displayed when a
user sets automatic updates for a low-score add-on, but he/she should
still be able to set it and get it. Not allowing this only makes users'
lives harder.
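
A minimal sketch of the "score only sets the default" idea, assuming an explicit user choice is stored as true or false and is absent otherwise. The 10,000 default threshold and the name autoUpdateEnabled are assumptions for illustration.

```javascript
// Hypothetical rule: the trust score only decides the default; an explicit
// user choice for a given add-on always wins.
function autoUpdateEnabled(trustScore, userChoice, defaultThreshold = 10000) {
  if (typeof userChoice === "boolean") return userChoice;
  return trustScore >= defaultThreshold;
}

console.log(autoUpdateEnabled(12000, undefined)); // true  (default for a high score)
console.log(autoUpdateEnabled(500, undefined));   // false (default for a low score)
console.log(autoUpdateEnabled(500, true));        // true  (user opted in anyway)
```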

We still have to go through editors. I liked DaveG's idea about
splitting editor reviews into two parts. I suggest that code reviewers
could have more points, more weight for their non-code reviews also,
to encourage code reviewing.

Andrew Williamson

Jul 4, 2009, 5:47:32 AM
On 03/07/2009 20:00, William Gianopoulos wrote:

> Lest anyone misconstrue my previous comments, I will try to make myself
> 100% clear.
> The current system relies too heavily on easily fakeable positive
> reviews of the extension.
> A system that relies more on the reputation of the author would be
> 100% better than what we have now.
> I am not saying that the proposed solution does this or is the correct
> solution.

We don't require reviews of extensions any more to gain public status (we
haven't for a few months).

Andrew Williamson

Jul 4, 2009, 6:04:46 AM
The 2 phase review idea is interesting but I'm not sure it would speed up the
process as much as you think. Currently, for most addons it's the code review
which takes the time and the testing is just a quick check to make sure its main
function appears to work (the developer & experimental downloaders having
tested it thoroughly before it's nominated). Having to check the code twice
(once in phase1 and then again in phase2) will easily end up taking more time
overall for a single addon.

So if the majority of add-ons only go to phase1 you'll get a small speed up. If
most developers don't want an 'unstable' tag next to their work (most developers
currently don't want an 'experimental' tag; hence the number of submissions)
then the overall workload will be at best roughly the same.

TheHalx

Jul 4, 2009, 7:28:39 AM
This sounds like a good idea generally. Just one point - in the spec for
the Add-on Verification Tool
(http://docs.google.com/View?docid=dcfr9qrp_1c2pgcsfh) it was suggested
to flag use of eval. This is definitely a good idea (to make it harder
to execute remote code, etc), but I'd like to make sure this isn't an
absolute restriction. In particular,
[[ eval("func="+func.toString().replace(/oldText/,"newText")); ]]
is the standard technique for dynamically
patching Firefox UI code, and is safe because it only evaluates known
code. (While such patching should be avoided where possible, advanced
extensions often have no choice to achieve the functionality they
need).
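
For readers who have not seen the technique, here is a self-contained sketch of it. The function greet is a made-up stand-in for a piece of application UI code; only the eval line follows the pattern quoted in the post.

```javascript
// Stand-in for a piece of application UI code (hypothetical).
var greet = function () { return "oldText shown"; };

// The patching technique: serialise the function to its source text, edit
// that text, and evaluate the result to redefine the function.
eval("greet=" + greet.toString().replace(/oldText/, "newText"));

console.log(greet()); // "newText shown"
```

The string passed to eval here is built only from code already running in the application plus a fixed replacement, which is the sense in which it evaluates known code.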

Dave Garrett

Jul 4, 2009, 9:48:48 AM
On Jul 4, 6:04 am, Andrew Williamson <evil.j...@yahoo.com> wrote:
> The 2 phase review idea is interesting but I'm not sure it would speed up the
> process as much as you think. Currently, for most addons it's the code review
> which takes the time and the testing is just a quick check to make sure its main
> function appears to work (the developer & experimental downloaders having
> tested it thoroughly before it's nominated). Having to check the code twice
> (once in phase1 and then again in phase2) will easily end up taking more time
> overall for a single addon.

I'm not just suggesting that the code and overall reviews be split up;
the phase 1 review is intended to be a faster code review. It'd be
only for showstopper issues. One way to do it would be to check
primarily issues flagged by the automatic checker as possible
security/stability issues. (yes, that would require writing a very good
autochecker) The point here is to not have the editor scour over the
code in-depth, but rather just scan for things that are big no-nos.
Just enough review to be reasonably assured that it's remotely safe
and then the phase 2 review for public status would catch the rest.

> So if the majority of add-ons only go to phase1 you'll get a small speed up. If
> most developers don't want an 'unstable' tag next to their work (most developers
> currently don't want an 'experimental' tag; hence the number of submissions)
> then the overall workload will be at best roughly the same.

That's why I also suggested user surveys. This offloads some of the
work in screening out add-ons that aren't ready yet. Many of the
add-ons that aren't going to be suitable for public status will avoid
having to waste time with a phase 2 review because user surveys would
come back as insufficient. Now, of course, this would also require
coming up with some suitable user survey system that we can actually
rely on, but it could be potentially helpful to implement. This
concept is basically why user reviews used to be required, but if it
was done in a formalized survey instead of a generic review I think it
could be made useful again.

John J. Barton

Jul 4, 2009, 11:30:33 AM

I'm a great fan of both eval() and of patching other people's JavaScript
code by setting properties (sometimes called "monkey patching"). There
is no security problem with eval() in extension code, unless the string
being eval-ed is from a web page; ditto monkey patching.

However, the technique of monkey patching by *editing* someone else's
code seems like a bad choice. It creates "automatic bugs" because the
developer is relying on the text of "func" to be "the same" in some
undefined way as the text of "func" when they read the code. As the base
code evolves, it's not possible to predict what the edit will do.

I would urge anyone reviewing code to suggest the alternative of copying
and editing the source of 'func'.
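
For comparison, a minimal sketch of monkey patching by setting a property, which does not depend on the exact text of the original function. The object toolbar and its label method are hypothetical.

```javascript
// Stand-in for an object owned by someone else's code (hypothetical).
var toolbar = { label: function () { return "Old label"; } };

// Monkey patching by setting a property: keep a reference to the original
// function and replace the property with a wrapper around it.
var originalLabel = toolbar.label;
toolbar.label = function () {
  return originalLabel.apply(this, arguments).replace("Old", "New");
};

console.log(toolbar.label()); // "New label"
```

The wrapper relies only on what the original function returns, not on how its source happens to be written.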

jjb

Archaeopteryx

Jul 7, 2009, 10:02:17 AM
Phase 2 seems pretty flawed because users will often submit feedback
directly after install.

Generally, the resource "reviewer time" is limited, so a few things emerge:
- reviewing new extensions and updates which offer compatibility with
new stable application versions should get priority (other updates can
be offered by the "untrusted" release channel)
- not all extensions will get code reviews - the user has to decide what
he installs. It is the same as downloading a zip or install exe from a
web page, the user has to check what he trusts.

Perhaps deeper changes to the Firefox back-end are needed where
extensions should run in sandboxes and the user can be asked to grant
permissions for the add-on like
- altering web pages (maybe only for a host/domain)
- accessing bookmarks
- accessing the file system
- accessing the password manager
- ...
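
A minimal sketch of how such per-add-on grants might be tracked, assuming permissions are plain names recorded per add-on id. Nothing like this exists in Firefox as described here; the function names and the permission names are made up.

```javascript
// Hypothetical permission registry: add-on id -> set of granted permission names.
const grants = new Map();

function grantPermission(addonId, permission) {
  if (!grants.has(addonId)) grants.set(addonId, new Set());
  grants.get(addonId).add(permission);
}

function isAllowed(addonId, permission) {
  return grants.has(addonId) && grants.get(addonId).has(permission);
}

grantPermission("example-addon", "bookmarks");
console.log(isAllowed("example-addon", "bookmarks"));  // true
console.log(isAllowed("example-addon", "filesystem")); // false
```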

Furthermore, testing needs more time here because I try to test
everything. If there are tiny extensions which do only one or few
things, 5 reviews per hour are possible, but often less.

A metric which provides information about extension problems can be the
_increase_ of extension disabled count (of course this could also be
gamed by competitors, so only as advice).

Brian King

Jul 7, 2009, 10:13:47 AM
Archaeopteryx wrote:
> Perhaps deeper changes to the Firefox back-end are needed where
> extensions should run in sandboxes and the user can be asked to grant
> permissions for the add-on like
> - altering web pages (maybe only for a host/domain)
> - accessing bookmarks
> - accessing the file system
> - accessing the password manager
> - ...

In an ideal world, users would be aware of what is going on under the
hood in add-ons, but in reality they would very quickly be overwhelmed by
all the information. Firefox would quickly turn into Windows Vista with all
the alerts! I doubt many care anyway.

--
Brian King
Need free Mozilla project hosting?
http://mozdev.org

Jeff.tet

unread,
Jul 7, 2009, 4:26:08 PM7/7/09
to

After reading all of the posts here and going back to look at the
proposal, my suggestions would be:

* new addon - MUST HAVE KNOWLEDGEABLE HUMAN REVIEW.

* addon updates - just submit the update to be reviewed. Updates can
be submitted and reviewed by a software review system that cross
references the original source code. Updates can be divided into
minor revisions (0.1.1) or major revisions (1.x), or they can be
submitted for beta testing only before final submission.

* anything involving ratings or trust can be manipulated. Rating
systems/surveys can be good for end users to see what people like or
dislike; if I were shopping for a TV or anything substantial I would do
my research, and part of that would be owner/user reviews. This should
carry little weight, if any, with AMO.

* bug submissions or functionality problems should be separate from
rating systems/surveys and should be reviewed by AMO at final red
carpet approval.

* addons/extensions should be entered into a build system, whether they
are actually designed there or not, and be submitted from there. From
there, say, an extension could be approved for beta status for users to
take part in testing.

* themes can take a much more relaxed position and should be checked
for bugs/css/xul errors.

Keeping everyone happy all the time is impossible!

AJ

unread,
Jul 9, 2009, 11:41:50 AM7/9/09
to
In general I guess I'd like to see anonymity and subjectivity removed from
the approval process and placed in the proper context with the app as
reviews attributed to specific reviewers. The approval process should focus
on reliability, security and compatibility.

As to a trust score system: Narrow market add-ons, tailored for use to a
small user community can suffer penalties due to the lack of editorial
expertise, and the lack of a large user base if points assignment isn't
balanced. A percentage system may be more useful than a threshold system.

Ratings are generally useless for small samples. One bad rating for a
low-volume app can have a dramatic effect, while for a popular app it is
averaged into the stats. Ratings should be ignored and not displayed to
users until a significant pool threshold has been reached. The Windows Live
Gallery has this problem: ratings are used almost exclusively to rank app
popularity and are commonly manipulated to promote and attack new apps. The
same applies to flagging. Craigslist sees flagged listings quite often where
people perceive the style of a posting in a certain way and flag it, only to
allow it later when the poster complains about inappropriate flagging in
their next post (e.g. someone sells their item and offers to ship it and
gets flagged because people think they are a non-local commercial seller).
This sort of flagging, mistaken or not, shouldn't be permitted to exclude
legitimate apps.
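
As a rough sketch of the pool-threshold idea above: the function names and the threshold of 20 ratings are hypothetical, chosen only for illustration, not values AMO uses.

```javascript
// Hypothetical minimum number of ratings before an average is shown.
const MIN_RATINGS = 20;

// Plain arithmetic mean of a non-empty list of star ratings.
function averageRating(ratings) {
  const sum = ratings.reduce((total, r) => total + r, 0);
  return sum / ratings.length;
}

// Returns null (nothing to display) until enough ratings have accumulated.
function displayedRating(ratings, minRatings = MIN_RATINGS) {
  if (ratings.length < minRatings) {
    return null;
  }
  return averageRating(ratings);
}

// One bad rating among four moves the average a full star...
const lowVolume = [5, 5, 5, 1];
// ...while one bad rating among a hundred barely registers.
const popular = new Array(99).fill(5).concat([1]);

console.log(averageRating(lowVolume));   // 4
console.log(displayedRating(lowVolume)); // null, sample too small to show
console.log(displayedRating(popular));
```

With the same single 1-star rating, the low-volume add-on drops from 5 to 4 stars while the popular one stays just under 5, which is why hiding the average for small samples may be fairer.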

Personal preferences greatly affect ratings without providing any indication
of the bias. I have no use for weather add-ons: I can either refrain from
rating or go in and bad-rate all the weather apps. Excluding me from ratings
calculations is a good idea (rate the rater). I've seen users on web review
systems rate five different companies negatively in one day and never post
another rating. That's just a bad day lashing out at the world.

There's no security factor. How are hostile add-ons excluded? As a user, I
couldn't care less about an editor's or reviewer's opinion of an app. If
there are 200 bad reviews, I pay attention. If I want it or need it, their
opinion is irrelevant; I try it anyway. My expectation from a system that
evaluates itself on the basis of trust is that I as a user can extend that
same trust to the app. This implies that the app won't try to harm my
computer, harvest my info, etc. Perhaps this reaches into the browser itself,
rather than just AMO.

I'd like to see subjectivity removed from the editorial process. One problem
with the editor review process is personalization of the review. Editors
that don't have personal need for a particular add-on are less likely to
understand the nuances of the add-on. If the reviews are limited, the editor
now needs to make an evaluation of the product from their own experience.
User reviews are currently limited by the sandbox design which is a barrier
that usually prevents a user from overriding their apathy long enough to
login.

I've had multiple add-ons that perform a similar general function, but have
specific presentation methods or customizations to the needs of different
users, rejected by an editor as too similar, while apps that are nearly
identical but written by different developers pass through the process
without any comparison to other existing apps. This is a case of different
standards being applied to different developers.


EnvironmentalChemistry

unread,
Jul 18, 2009, 11:55:45 PM7/18/09
to
Along the lines of what Jeff.tet suggested, one thing that could help
speed up the review process and reduce reviewer work load is to have
automated processes do file compares between existing approved add-ons
and their updates. For instance, add-ons that only have modifications
to the install.rdf file, CSS files and/or language localization files
might not need to be reviewed at all because these types of changes
pose a very low security risk. This automated file compare could also
allow updated add-ons to be prioritized with those add-ons with minor
changes being given to junior reviewers while add-ons with heavy
changes are passed on to more senior evaluators.
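
A minimal sketch of such an automated file compare follows. It assumes each add-on version is available as a map from file path to content (or content hash); the path patterns for localization files, the queue names, and the cut-off of two changed code files are hypothetical choices made only for illustration.

```javascript
// File types the post treats as low risk: install.rdf, CSS and localization
// files. The exact patterns here are assumptions.
const LOW_RISK_PATTERNS = [
  /(^|\/)install\.rdf$/,
  /\.css$/,
  /(^|\/)locale\/.+\.(dtd|properties)$/,
];

// Paths that were added, removed, or whose content differs between versions.
function changedFiles(oldFiles, newFiles) {
  const paths = new Set([...Object.keys(oldFiles), ...Object.keys(newFiles)]);
  return [...paths].filter((p) => oldFiles[p] !== newFiles[p]).sort();
}

// Sort an update into a review queue based on which files changed.
function triageUpdate(oldFiles, newFiles) {
  const changed = changedFiles(oldFiles, newFiles);
  const risky = changed.filter(
    (p) => !LOW_RISK_PATTERNS.some((pattern) => pattern.test(p))
  );
  let queue;
  if (risky.length === 0) {
    queue = "low-risk";          // might not need a human review at all
  } else if (risky.length <= 2) { // hypothetical cut-off for "minor changes"
    queue = "junior";
  } else {
    queue = "senior";
  }
  return { changed, risky, queue };
}

const approved = {
  "install.rdf": "version 1.0",
  "chrome/content/main.js": "code A",
  "chrome/skin/main.css": "style A",
};
const cosmeticUpdate = {
  "install.rdf": "version 1.1",
  "chrome/content/main.js": "code A",
  "chrome/skin/main.css": "style B",
};

console.log(triageUpdate(approved, cosmeticUpdate).queue); // "low-risk"
```

A real system would compare the unpacked contents of the two XPI packages; counting changed files is only a crude proxy for how heavy a change is.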

Bernat

unread,
Jul 24, 2009, 8:05:27 PM7/24/09
to
I agree that user rating shouldn't affect trust, sometimes users rate
upon their personal preference over other add-ons, or it can be
misused by competing add-on fans/authors.

Also, I think automated add-on reviews should be taken more
seriously; dictionaries and language packs especially are good
candidates for this. Today, language packs and dictionaries take an
unacceptable time to review even when they're just updates. Doing this
process automatically would catch mistakes better and save a lot of
human work.

I have a language pack update for Firefox 3.5 that has been waiting for
approval for some weeks, and users still have to stick to Firefox 3.0 or
find out that there's a language pack for Firefox 3.5 hidden somewhere.

NettiCat

unread,
Jul 24, 2009, 8:35:22 PM7/24/09
to
I think code review by editors is indispensable as most other 'trust
score' methods are easy to fake.

Suggestions

1. Limit the regular update interval of addons:
There are addons that release updates every 2 weeks but still get
reviewed. This ties up enormous editor capacity and prevents other
addons from being reviewed. Thus I propose limiting regular updates
of an addon to, e.g., at most one every 6 months. Allow necessary extra
updates, but only if the author can assure that it is, e.g., an
essential security update. If the author repeatedly abuses extra
updates, ignore them for a time.

2. Get more editors by offering a small fee for every review.
Officially an editor gets nothing in return for doing the review work.
I would consider becoming an editor if the work were rewarded
that way.


NettiCat

unread,
Jul 24, 2009, 10:07:31 PM7/24/09
to
Addendum:

The name 'experimental add-on' really leaves a bad taste:
It suggests that the addon author is experimenting with the user's
browser.
This usually does not apply, because addon authors do everything to
release working and bug-free addons, and even version 1.0 is a final
release.

A more descriptive name would be better, for example:
'Not yet approved by AMO editors'

Victor.Kotov.Sibers

unread,
Jul 27, 2009, 4:51:07 AM7/27/09
to
Hello Justin,

On Jul 2, 5:37 am, Justin Scott <flig...@mozilla.com> wrote:
> If you're interested in the review
> process and distribution of add-ons, please read the proposal below and
> give us your feedback.
>
> Proposal: http://docs.google.com/View?docID=dfntthnr_0f3wtksf2&revision=_latest
>
> We are especially interested in feedback on the following:
>
> * What other add-on trust modifiers are there?
> * Will this new system help end users? developers?
> * Is the "Trust Score" the right way to do this?

First of all, thanks, that's a great idea and it should absolutely go
live!

I think the new system will affect developers first, but as more and
more users gain the ability to try plugins and provide feedback,
the system will work for both sides :)

Looking forward to seeing it live.

World Star

unread,
Jul 28, 2009, 3:08:38 AM7/28/09
to
On Jul 2, 6:37 am, Justin Scott <flig...@mozilla.com> wrote:
> Proposal: http://docs.google.com/View?docID=dfntthnr_0f3wtksf2&revision=_latest
>
> Thanks!
>
> Justin Scott

The new approach suggested in the proposal sounds great and I have a
few comments on it.

***** A non-exploitable or non-gamable system is ACTUALLY NOT
preferred *****
First of all, we should be aware that a non-exploitable or non-gamable
system isn't what we should aim for. Ideally every single update
would be code reviewed because there is always a chance that it is a
malicious update, no matter how small it is. However, that would
dramatically increase the workload and make the process ultra-slow,
which is just impracticable. Security implies inconvenience and lower
efficiency. We can't get the best of both. The key is to find a good
balance - a realistically ideal balance. Don't waste time developing a
non-exploitable or non-gamable system. It neither exists nor is necessary.

Correct me if I'm wrong. I have never heard of a malicious addon
successfully bypassing the current review checks and slipping into
public. Even though the current process is exploitable, it doesn't
matter. What matters is that it's realistically adequate to keep
malicious addons away.

So our focus should be how we should increase the speed of our process
while maintaining satisfactory level of security.


## Check "Auto Approvals" ##
That an author was trustworthy four years ago doesn't mean they remain
trustworthy now. Take a look at Wikipedia. One Wikipedia admin who had
been considered trustworthy for several years was caught abusing their
power one day. We do see cases where a (very) trusted person can become
untrustworthy, so there should be some measures we can implement to
guard against this. "Auto-approval", while very hard to get,
is a potential devil which lures the developer into doing bad things. "Once
trusted, never checked again" may tempt the developer to silently add some
controversial code to the trusted addon because they know
for sure the code won't be checked before going public. It has
happened at least once on AMO as far as I know.

As a minimal measure, between the process of "Trusted Add-on?" and "Go
Public", I hope that editors would at least still periodically check/
review the code to see if a developer has abused the "trust".


## Automatic Phase ##
As said above, "auto approval" is a bad bait which may lure someone
into becoming less trustworthy one day. "Auto approval" is a good way to
save time and lower workload but a bad way in terms of security. I would
propose an "automatic phase" as a replacement for "auto approval". Here's
how it works.

First of all you have to gain a minimum level of trust score to be
eligible for the "automatic phase". It works as follows:

Trust level        Automatic approval after...
Very low to low    not eligible; all updates must be checked and
                   manually approved
Medium             XXX days (longest)
High               XX days
Very high          X days (shortest)
(The exact times are not disclosed; keeping them vague makes the system
harder to game.)

The idea behind this system is that we spend less time on those we
trust more. The time saved in checking low-risk things can be re-
allocated to processing more new submissions and updates to less
trusted addons. The whole process would be faster while a similar level
of security is still attained.

The concept behind this system is simple. We still check as many trusted
addons and updates as possible, but we automatically approve one
if it waits too long in the queue. This is a way to achieve a good
balance of "efficiency" and "security".

Another benefit of this system is that we no longer grant absolute
auto approval to anyone, so there are no more "bad baits" to turn good
guys into bad guys. Every time you submit an update there is still a
chance that your update will be checked. We just don't do it if we are
too busy with other stuff. This would make developers think twice when
they want to push controversial code in an update.
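
A minimal sketch of the "automatic phase" rule described above. The post intentionally leaves the real waiting periods undisclosed (XXX / XX / X days), so the day counts below are hypothetical placeholders, as are the function and status names.

```javascript
// Hypothetical waiting periods in days; the real values are undisclosed.
const AUTO_APPROVE_AFTER_DAYS = new Map([
  ["medium", 30],
  ["high", 14],
  ["very high", 3],
]);

// Decide what happens to a queued update that no editor has reviewed yet.
function queueDecision(trustLevel, daysInQueue) {
  const limit = AUTO_APPROVE_AFTER_DAYS.get(trustLevel);
  if (limit === undefined) {
    // "very low" and "low" trust levels are never approved automatically.
    return "manual review required";
  }
  return daysInQueue >= limit ? "auto-approved" : "waiting for editor";
}

console.log(queueDecision("low", 365));     // "manual review required"
console.log(queueDecision("medium", 10));   // "waiting for editor"
console.log(queueDecision("very high", 3)); // "auto-approved"
```

An editor can still pick up any update while it is waiting, so no developer knows in advance that a given update will go unchecked.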


## Update with Checkbox ##
While an update is pending, is it possible to let users choose to
install the updated version anyway? For example, BetterPrivacy is still
stuck at 1.29 on AMO while the latest version is already 1.40. It
would be nice to notify users that an updated version is indeed
available but is still under review/in the sandbox. Tell them the risks
and let them tick a checkbox if they decide to trust it anyway.

The lack of updates for addons with "sandbox" status can be a
problem. "Update with checkbox" comes in handy for people who know
the risks involved but still want to try it anyway.


## 10-point scale Rating System ##
I find the rating system much less useful because there are so many
addons now and too many addons share the same rating. It makes
searching for a better addon by rating much less effective. I would
like to see the 5-point rating scale changed to a 10-point scale.
It could be a new 1, 2, 3...10 rating system, or a 0.5,
1.0, 1.5, 2.0...5.0 rating system.


## Technical Review ##
Anyone who has the technical knowledge to do code review but
isn't part of the team should be welcome to submit a code review,
report problematic code, and verify whether the code is clean. It
shouldn't be just a simple statement that the code is clean. There should
be a pretty long form, similar to Bugzilla, which asks them some questions.

Justin Scott

unread,
Jul 29, 2009, 3:56:02 PM7/29/09
to
Hey everyone,

I wanted to give a brief update on this.

I've read through everyone's feedback and it's been very helpful in
identifying the areas that we need to refine in the proposal. I'm
actively working on several other AMO projects that will be implemented
before this project would, so they have had most of my attention.

As soon as I can, I'll respond in more detail to the points raised and
include a revised proposal.

Thanks for your feedback and patience!

Justin

Justin Scott wrote:
> The "Sandbox Model" addons.mozilla.org uses to organize and review
> add-ons was first announced almost 3 years ago
> (http://blog.fligtar.com/2006/11/21/reviewing-the-review-process/).
> Since then, we've made a number of changes based on user feedback that,
> in my opinion, have greatly improve the experience of finding and
> installing add-ons that haven't been officially reviewed yet.
>
> Today, the main feedback concerning the review and distribution process
> of add-ons is:
> * developers feel it takes too long for add-ons to be reviewed, and
> * users and developers want to receive updates to add-ons that they have
> installed that haven't been reviewed yet
>
> It's important for us to balance our desire for all add-ons to be
> discoverable and easy to install with the need for security measures for
> add-ons that haven't been reviewed yet.
>
> After taking many of these issues into account, I've come up with a
> proposal for removing the public and sandbox classifications on the site
> and moving to a more flexible, comprehensive trust system based on
> everything we know about an add-on. If you're interested in the review
> process and distribution of add-ons, please read the proposal below and
> give us your feedback.
>
