Checking permissions while running reindex program?

77 views
Skip to first unread message

Saša Živkov

unread,
Sep 6, 2016, 12:03:21 PM9/6/16
to repo-d...@googlegroups.com
When (offline) reindex program is run Gerrit it will check permissions
when computing index fields. From what I saw in the debugger it will
set the change owner as the current user and will then evaluate permissions against
that user. Does that make sense? It is not only potentially expensive to check the permissions
but it also may affect the results if change owner doesn't have some permissions, right?

Shouldn't reindex rather use an InternalUser and avoid all permission checks?


Dave Borowitz

unread,
Sep 6, 2016, 12:24:54 PM9/6/16
to Saša Živkov, repo-d...@googlegroups.com
Which permissions are you specifically concerned about here?

There is at least one corner case in NoteDb where we do actually have to check permissions during reindex. Since approvals in NoteDb are not copied forward in storage at patch set creation time, we run the ApprovalCopier via ChangeData#currentApprovals. ApprovalCopier (currently) uses permissions to truncate approvals to the allowed range for that user.

For that case we probably don't actually care what the user is in the ChangeControl we pass to ApprovalCopier#getForPatchSet, since it turns around and calls controlFor(otherUser). But passing InternalUser won't actually avoid most of the ACL evaluations.

--
--
To unsubscribe, email repo-discuss+unsubscribe@googlegroups.com
More info at http://groups.google.com/group/repo-discuss?hl=en

---
You received this message because you are subscribed to the Google Groups "Repo and Gerrit Discussion" group.
To unsubscribe from this group and stop receiving emails from it, send an email to repo-discuss+unsubscribe@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Saša Živkov

unread,
Sep 6, 2016, 4:29:11 PM9/6/16
to Dave Borowitz, repo-d...@googlegroups.com
On Tue, Sep 6, 2016 at 6:24 PM, Dave Borowitz <dbor...@google.com> wrote:
Which permissions are you specifically concerned about here?

Specifically, I have seen that computing mergeability checks Read permission on the target branch.
The code reaches RefControl.isVisible [1] and proceeds to canPerform(Permission.READ)
because getUser().isInternalUser() returns false.
I haven't yet checked if and what other permissions may also be evaluated. 

In general my concern was about any permission evaluation during indexing... IMO we don't need
it and if we cannot easily avoid calling code which does permission evaluation then using an InternalUser is
the intended way to avoid that evaluation, right?


There is at least one corner case in NoteDb where we do actually have to check permissions during reindex. Since approvals in NoteDb are not copied forward in storage at patch set creation time, we run the ApprovalCopier via ChangeData#currentApprovals. ApprovalCopier (currently) uses permissions to truncate approvals to the allowed range for that user.

For that case we probably don't actually care what the user is in the ChangeControl we pass to ApprovalCopier#getForPatchSet, since it turns around and calls controlFor(otherUser).

I understand. However, this is a completely different thing... it doesn't depend on the current user and it
does some permission evaluation for a user who posted an approval on that change, not for the current user.
Conceptually this is not really a part of reindexing but more a lazy-update, if I understand it correctly.
 
But passing InternalUser won't actually avoid most of the ACL evaluations.

Dave Borowitz

unread,
Sep 6, 2016, 4:40:17 PM9/6/16
to Saša Živkov, repo-d...@googlegroups.com
On Tue, Sep 6, 2016 at 4:28 PM, Saša Živkov <ziv...@gmail.com> wrote:


On Tue, Sep 6, 2016 at 6:24 PM, Dave Borowitz <dbor...@google.com> wrote:
Which permissions are you specifically concerned about here?

Specifically, I have seen that computing mergeability checks Read permission on the target branch.
The code reaches RefControl.isVisible [1] and proceeds to canPerform(Permission.READ)
because getUser().isInternalUser() returns false.
I haven't yet checked if and what other permissions may also be evaluated. 

Good point, that would be nice to short-circuit.
 
In general my concern was about any permission evaluation during indexing... IMO we don't need
it and if we cannot easily avoid calling code which does permission evaluation then using an InternalUser is
the intended way to avoid that evaluation, right?


There is at least one corner case in NoteDb where we do actually have to check permissions during reindex. Since approvals in NoteDb are not copied forward in storage at patch set creation time, we run the ApprovalCopier via ChangeData#currentApprovals. ApprovalCopier (currently) uses permissions to truncate approvals to the allowed range for that user.

For that case we probably don't actually care what the user is in the ChangeControl we pass to ApprovalCopier#getForPatchSet, since it turns around and calls controlFor(otherUser).

I understand. However, this is a completely different thing... it doesn't depend on the current user and it
does some permission evaluation for a user who posted an approval on that change, not for the current user.
Conceptually this is not really a part of reindexing but more a lazy-update, if I understand it correctly.

You're correct that it's for a user who voted on a label, but it really does have to do it during reindexing. In other words yes it's a lazy lookup but we do the lazy lookup when indexing a change.

ChangeField#LABEL looks up labels on the current patch set to store them in the index.
LABEL#get
-> ChangeData#currentApprovals
-> ApprovalsUtil#byPatchSet
-> ApprovalCopier#getForPatchSet
-> LabelNormalizer#normalize
-> LabelNormalizer#applyRightFloor
 
 
But passing InternalUser won't actually avoid most of the ACL evaluations.

It would avoid ACL evaluation at least in the example [1].
Any better ideas/proposals?

I do not have any ideas about how to avoid calling LabelNormalizer#applyRightFloor in the index path, sorry :(

One thing that will not work is indexing the un-squashed values for a user. Say a user voted CR+2 on PS1, then PS2 is a trivial rebase, then the user lost their permission to vote CR+2. Nonetheless NoteDb only stores the CR+2 on PS1. In the change screen (and more importantly in the submit rule evaluator), we will treat it as if the user voted CR+1. So it would be incorrect to return this change in a search for "label:Code-Review+2,user=theuser".

Dave Borowitz

unread,
Sep 6, 2016, 4:42:24 PM9/6/16
to Saša Živkov, repo-d...@googlegroups.com
I just realized we still have this exact problem even without NoteDb. In this scenario under ReviewDb, if we copy the CR+2 vote to PS2 before the user has their access revoked, it would still be lazily squashed to CR+1 even though what's stored in the database is still CR+2.

Dave Borowitz

unread,
Sep 6, 2016, 4:45:14 PM9/6/16
to Saša Živkov, repo-d...@googlegroups.com
Ugh. Except ApprovalsUtil#byPatchSet won't call ApprovalCopier#getForPatchSet in the non-NoteDb case, so actually ReviewDb is indexing the wrong value here and that's a bug IMO.

Saša Živkov

unread,
Sep 7, 2016, 9:56:05 AM9/7/16
to Dave Borowitz, repo-d...@googlegroups.com
We are discussing two topics here:
1. (unnecessary) permission checks during (re)indexing against the current user
2. normalization of the approval values which is done during (re)indexing and which also involves permission checks


On Tue, Sep 6, 2016 at 10:39 PM, Dave Borowitz <dbor...@google.com> wrote:


On Tue, Sep 6, 2016 at 4:28 PM, Saša Živkov <ziv...@gmail.com> wrote:


On Tue, Sep 6, 2016 at 6:24 PM, Dave Borowitz <dbor...@google.com> wrote:
Which permissions are you specifically concerned about here?

Specifically, I have seen that computing mergeability checks Read permission on the target branch.
The code reaches RefControl.isVisible [1] and proceeds to canPerform(Permission.READ)
because getUser().isInternalUser() returns false.
I haven't yet checked if and what other permissions may also be evaluated. 

Good point, that would be nice to short-circuit.

With the current logic of evaluating permission against the change owner we can even
end-up into failing to (re)index a change. This can happen in a setup where projects
are not generally visible to all users but only to project owners and when a change owner
leaves the team and looses Read permission on the project. In that case the visibility check
returns false.
NOTE: I haven't verified the above assumption yet.

Therefore, I think we must use InternalUser when reindexing.
  
 
In general my concern was about any permission evaluation during indexing... IMO we don't need
it and if we cannot easily avoid calling code which does permission evaluation then using an InternalUser is
the intended way to avoid that evaluation, right?


There is at least one corner case in NoteDb where we do actually have to check permissions during reindex. Since approvals in NoteDb are not copied forward in storage at patch set creation time, we run the ApprovalCopier via ChangeData#currentApprovals. ApprovalCopier (currently) uses permissions to truncate approvals to the allowed range for that user.

For that case we probably don't actually care what the user is in the ChangeControl we pass to ApprovalCopier#getForPatchSet, since it turns around and calls controlFor(otherUser).

I understand. However, this is a completely different thing... it doesn't depend on the current user and it
does some permission evaluation for a user who posted an approval on that change, not for the current user.
Conceptually this is not really a part of reindexing but more a lazy-update, if I understand it correctly.

You're correct that it's for a user who voted on a label, but it really does have to do it during reindexing.

Does it also has to be done when Gerrit is stopped and we do offline (re)indexing?
This seems not necessary if the approvals in NoteDb were already copied?
 
In other words yes it's a lazy lookup but we do the lazy lookup when indexing a change.

ChangeField#LABEL looks up labels on the current patch set to store them in the index.
LABEL#get
-> ChangeData#currentApprovals
-> ApprovalsUtil#byPatchSet
-> ApprovalCopier#getForPatchSet
-> LabelNormalizer#normalize
-> LabelNormalizer#applyRightFloor
 
 
But passing InternalUser won't actually avoid most of the ACL evaluations.

It would avoid ACL evaluation at least in the example [1].
Any better ideas/proposals?

I do not have any ideas about how to avoid calling LabelNormalizer#applyRightFloor in the index path, sorry :(

I am right now more concerned about topic 1 and I think we agree that using InternalUser instead of the change owner
as the current user when reindexing a change is a possible bug fix.

Btw, another issue with evaluating permissions during offline reindexing is that it can produce
tons of "Unkown GroupMembership for UUID: user%3A..." warnings if a group backend is contributed
by a plugin (singleusergroup plugin) and a permission is assigned to a group from that backend ("user/joe" for the singleusergroup plugin).
There will be one such warning for each change from that project.

Saša Živkov

unread,
Sep 7, 2016, 11:18:10 AM9/7/16
to Dave Borowitz, repo-d...@googlegroups.com
On Wed, Sep 7, 2016 at 3:55 PM, Saša Živkov <ziv...@gmail.com> wrote:
We are discussing two topics here:
1. (unnecessary) permission checks during (re)indexing against the current user
2. normalization of the approval values which is done during (re)indexing and which also involves permission checks


On Tue, Sep 6, 2016 at 10:39 PM, Dave Borowitz <dbor...@google.com> wrote:


On Tue, Sep 6, 2016 at 4:28 PM, Saša Živkov <ziv...@gmail.com> wrote:


On Tue, Sep 6, 2016 at 6:24 PM, Dave Borowitz <dbor...@google.com> wrote:
Which permissions are you specifically concerned about here?

Specifically, I have seen that computing mergeability checks Read permission on the target branch.
The code reaches RefControl.isVisible [1] and proceeds to canPerform(Permission.READ)
because getUser().isInternalUser() returns false.
I haven't yet checked if and what other permissions may also be evaluated. 

Good point, that would be nice to short-circuit.

With the current logic of evaluating permission against the change owner we can even
end-up into failing to (re)index a change. This can happen in a setup where projects
are not generally visible to all users but only to project owners and when a change owner
leaves the team and looses Read permission on the project. In that case the visibility check
returns false.
NOTE: I haven't verified the above assumption yet.

Now I tested it: removed Read permission from a change owner and run offline reindex.
Changes of that change owner *were* indexed (so my assumption above is not fully true)
but their "mergeable" flag wasn't computed. Starting Gerrit afterwards and executing a query like:

  status:open is:mergeable

didn't find these changes although they were mergeable.

Dave Borowitz

unread,
Sep 8, 2016, 11:15:56 AM9/8/16
to Saša Živkov, repo-d...@googlegroups.com
I talked about this with Jonathan a bit offline, and we think it might be ok to have ApprovalCopier/LabelNormalizer not bother trying to squash votes according to the current permission range. This is a behavior change, but we might be ok with it.

On Wed, Sep 7, 2016 at 9:55 AM, Saša Živkov <ziv...@gmail.com> wrote:
We are discussing two topics here:
1. (unnecessary) permission checks during (re)indexing against the current user
2. normalization of the approval values which is done during (re)indexing and which also involves permission checks


On Tue, Sep 6, 2016 at 10:39 PM, Dave Borowitz <dbor...@google.com> wrote:


On Tue, Sep 6, 2016 at 4:28 PM, Saša Živkov <ziv...@gmail.com> wrote:


On Tue, Sep 6, 2016 at 6:24 PM, Dave Borowitz <dbor...@google.com> wrote:
Which permissions are you specifically concerned about here?

Specifically, I have seen that computing mergeability checks Read permission on the target branch.
The code reaches RefControl.isVisible [1] and proceeds to canPerform(Permission.READ)
because getUser().isInternalUser() returns false.
I haven't yet checked if and what other permissions may also be evaluated. 

Good point, that would be nice to short-circuit.

With the current logic of evaluating permission against the change owner we can even
end-up into failing to (re)index a change. This can happen in a setup where projects
are not generally visible to all users but only to project owners and when a change owner
leaves the team and looses Read permission on the project. In that case the visibility check
returns false.
NOTE: I haven't verified the above assumption yet.

Therefore, I think we must use InternalUser when reindexing.

This is a good point, I agree that we should use InternalUser for this reason.

I think we probably have a similar bug in other codepaths as well. ChangeData#changeControl() assumes the change owner is the right user if it has not previously been initialized with another user.
 
 
In general my concern was about any permission evaluation during indexing... IMO we don't need
it and if we cannot easily avoid calling code which does permission evaluation then using an InternalUser is
the intended way to avoid that evaluation, right?


There is at least one corner case in NoteDb where we do actually have to check permissions during reindex. Since approvals in NoteDb are not copied forward in storage at patch set creation time, we run the ApprovalCopier via ChangeData#currentApprovals. ApprovalCopier (currently) uses permissions to truncate approvals to the allowed range for that user.

For that case we probably don't actually care what the user is in the ChangeControl we pass to ApprovalCopier#getForPatchSet, since it turns around and calls controlFor(otherUser).

I understand. However, this is a completely different thing... it doesn't depend on the current user and it
does some permission evaluation for a user who posted an approval on that change, not for the current user.
Conceptually this is not really a part of reindexing but more a lazy-update, if I understand it correctly.

You're correct that it's for a user who voted on a label, but it really does have to do it during reindexing.

Does it also has to be done when Gerrit is stopped and we do offline (re)indexing?
This seems not necessary if the approvals in NoteDb were already copied?

The problem is the approvals in NoteDb are *not* already copied. If you vote on PS1 with a sticky label and someone pushes PS2, NoteDb only recorded your PS1 vote, there is no vote on PS2 in NoteDb at all. The output of ApprovalsUtil#byPatchSet may contain a vote on PS2 due to copy rules, but that PatchSetApproval instance is not stored anywhere, it's computed on the fly each time.
 
 
In other words yes it's a lazy lookup but we do the lazy lookup when indexing a change.

ChangeField#LABEL looks up labels on the current patch set to store them in the index.
LABEL#get
-> ChangeData#currentApprovals
-> ApprovalsUtil#byPatchSet
-> ApprovalCopier#getForPatchSet
-> LabelNormalizer#normalize
-> LabelNormalizer#applyRightFloor
 
 
But passing InternalUser won't actually avoid most of the ACL evaluations.

It would avoid ACL evaluation at least in the example [1].
Any better ideas/proposals?

I do not have any ideas about how to avoid calling LabelNormalizer#applyRightFloor in the index path, sorry :(

I am right now more concerned about topic 1 and I think we agree that using InternalUser instead of the change owner
as the current user when reindexing a change is a possible bug fix.

Btw, another issue with evaluating permissions during offline reindexing is that it can produce
tons of "Unkown GroupMembership for UUID: user%3A..." warnings if a group backend is contributed
by a plugin (singleusergroup plugin) and a permission is assigned to a group from that backend ("user/joe" for the singleusergroup plugin).
There will be one such warning for each change from that project.

Ouch, getting rid of those warnings will be nice.

Permissions issues aside, I wonder if it's a problem in general that we don't load plugins in Reindex (or other programs). For example, should we call ChangeIndexedListeners from plugins?

Saša Živkov

unread,
Sep 8, 2016, 11:28:47 AM9/8/16
to Dave Borowitz, repo-d...@googlegroups.com
Understood :-)
 
 
 
In other words yes it's a lazy lookup but we do the lazy lookup when indexing a change.

ChangeField#LABEL looks up labels on the current patch set to store them in the index.
LABEL#get
-> ChangeData#currentApprovals
-> ApprovalsUtil#byPatchSet
-> ApprovalCopier#getForPatchSet
-> LabelNormalizer#normalize
-> LabelNormalizer#applyRightFloor
 
 
But passing InternalUser won't actually avoid most of the ACL evaluations.

It would avoid ACL evaluation at least in the example [1].
Any better ideas/proposals?

I do not have any ideas about how to avoid calling LabelNormalizer#applyRightFloor in the index path, sorry :(

I am right now more concerned about topic 1 and I think we agree that using InternalUser instead of the change owner
as the current user when reindexing a change is a possible bug fix.

Btw, another issue with evaluating permissions during offline reindexing is that it can produce
tons of "Unkown GroupMembership for UUID: user%3A..." warnings if a group backend is contributed
by a plugin (singleusergroup plugin) and a permission is assigned to a group from that backend ("user/joe" for the singleusergroup plugin).
There will be one such warning for each change from that project.

Ouch, getting rid of those warnings will be nice.

Permissions issues aside, I wonder if it's a problem in general that we don't load plugins in Reindex (or other programs).
 
Yes, it is caused by not loading (relevant) plugins. At least those plugins contributing group backends need to be loaded.
Currently, I think, daemon is the only program which loads plugins.
 
For example, should we call ChangeIndexedListeners from plugins?
 
I am not sure if we would always want that. Imagine there is a plugin which updates
an external system whenever a change is indexed. We may not want to trigger
such update(s) when doing offline reindexing.
Maybe an option to choose which plugins to load is the way to go?

Dave Borowitz

unread,
Sep 8, 2016, 11:32:17 AM9/8/16
to Saša Živkov, repo-d...@googlegroups.com
Agreed. For the specific case of ChangeIndexedListener, we could also document that it is not respected during offline reindexing.
 
Maybe an option to choose which plugins to load is the way to go?

Or let plugins themselves declare whether they prefer to be loaded in an offline program.

Dave Borowitz

unread,
Sep 15, 2016, 2:57:37 PM9/15/16
to Saša Živkov, repo-d...@googlegroups.com
On Thu, Sep 8, 2016 at 11:31 AM, Dave Borowitz <dbor...@google.com> wrote:


On Thu, Sep 8, 2016 at 11:28 AM, Saša Živkov <ziv...@gmail.com> wrote:


On Thu, Sep 8, 2016 at 5:15 PM, Dave Borowitz <dbor...@google.com> wrote:
I talked about this with Jonathan a bit offline, and we think it might be ok to have ApprovalCopier/LabelNormalizer not bother trying to squash votes according to the current permission range. This is a behavior change, but we might be ok with it.

Shawn is strongly against this. He makes the good point that the intention of revoking +2 is quite likely to prevent certain unsubmitted changes that were improperly +2'ed from getting submitted.

That said, I think we might be able to get away with not doing the permission checks and squashing during indexing anyway. It might be slightly confusing to have a change in the results for "label:Code-Review+2" and then you load the change screen and only see +1. But you will still see the +2 in the message log, so it's possible to figure out what's going on.

To do this we will need to do an index schema migration, creating a new field to store the new definition of labels. A bit of work but nothing we haven't done before.
Reply all
Reply to author
Forward
0 new messages