sheriff-o-matic feedback


Dan Beam

Sep 17, 2014, 9:13:34 PM
to Ojan Vafai, hackabi...@chromium.org, Dirk Pranke
Hey ojan@ and others,

dpranke@ mentioned I should send you some feedback after using sheriff-o-matic for a day.

Overall: I thought it was quite useful!

I really like that:
- sheriff-o-matic constantly refreshes itself
- it shows all the things that matter to a sheriff (recent failures with relevant info)

It'd be cool if s-o-m showed:
- how recent each failure is (e.g. 5m ago)
- recent reverts and what they're addressing

Additionally, I didn't expect that clicking on "Builder/tester name" goes directly to the failed build (e.g. http://bit.ly/1wFnfRX).  I expected clicks on "Builder/tester name" to go to ... builder/tester :) (e.g. http://bit.ly/1rcffof).  Unfortunately, I basically disregard failures until 2+ failed runs so the builder/tester's recent runs page (http://bit.ly/1rcffof) is more useful to me.

Thanks for this useful product!

--
Dan Beam

Ojan Vafai

Sep 18, 2014, 8:11:49 PM
to Dan Beam, hackability-cy, Dirk Pranke
On Wed, Sep 17, 2014 at 6:13 PM, Dan Beam <db...@chromium.org> wrote:
Hey ojan@ and others,

dpranke@ mentioned I should send you some feedback after using sheriff-o-matic for a day.

Overall: I thought it was quite useful!

Yay! 

I really like that:
- sheriff-o-matic constantly refreshes itself
- it shows all the things that matter to a sheriff (recent failures with relevant info)

The short-term goal is that a sheriff's *only* job should be to address the alerts shown on this page.

It'd be cool if s-o-m showed:
- how recent each failure is (e.g. 5m ago)

What would you do with this information? I'm trying to understand why people want this so I can come up with the right, minimal UI.
 
- recent reverts and what they're addressing

I mentioned something like this in https://code.google.com/p/chromium/issues/detail?id=401879#c7, but I guess we could implement this without implementing the revert button. The idea I had is to show in the regression range which of the patches have already been reverted. Would that be enough, or do you actually want a log of recent reverts? If the latter, can you say how you'd use it?

Additionally, I didn't expect that clicking on "Builder/tester name" goes directly to the failed build (e.g. http://bit.ly/1wFnfRX).  I expected clicks on "Builder/tester name" to go to ... builder/tester :) (e.g. http://bit.ly/1rcffof).  Unfortunately, I basically disregard failures until 2+ failed runs so the builder/tester's recent runs page (http://bit.ly/1rcffof) is more useful to me.

Yeah. I agree that this would be a good change for other reasons. Filed crbug.com/415800.

For this specific issue, sheriff-o-matic does this work for you. Failures that have only happened on one bot for one run show up in a separate list below the reliable failures list: "Failures that have only happened once (on one bot)". At least...it tries to do that. Might have bugs.
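
As a rough illustration of that split (a sketch only, not sheriff-o-matic's actual code; the `FailedRun` record and the `split_failures` helper are hypothetical names), a failure could be treated as reliable once it has been seen in more than one (bot, build) pair:

```python
from collections import defaultdict
from typing import Iterable, List, NamedTuple, Tuple


class FailedRun(NamedTuple):
    """One failing run. All field names here are assumptions for illustration."""
    failure: str  # e.g. a failing step or test name
    bot: str      # builder/tester name
    build: int    # build number on that bot


def split_failures(runs: Iterable[FailedRun]) -> Tuple[List[str], List[str]]:
    """Return (reliable, happened_once) lists of failure names."""
    occurrences = defaultdict(set)
    for run in runs:
        occurrences[run.failure].add((run.bot, run.build))

    reliable, happened_once = [], []
    for failure, seen in occurrences.items():
        # Exactly one (bot, build) pair means "one bot, one run".
        (reliable if len(seen) > 1 else happened_once).append(failure)
    return sorted(reliable), sorted(happened_once)
```

With this rule, a failure that shows up on a second bot, or in a second run on the same bot, moves from the "happened once" list to the reliable list.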

Thanks for this useful product!

Thanks for the feedback!
 

Dan Beam

Sep 18, 2014, 10:27:56 PM
to Ojan Vafai, hackability-cy, Dirk Pranke
On Thu, Sep 18, 2014 at 5:11 PM, Ojan Vafai <oj...@chromium.org> wrote:
On Wed, Sep 17, 2014 at 6:13 PM, Dan Beam <db...@chromium.org> wrote:
Hey ojan@ and others,

dpranke@ mentioned I should send you some feedback after using sheriff-o-matic for a day.

Overall: I thought it was quite useful!

Yay! 

I really like that:
- sheriff-o-matic constantly refreshes itself
- it shows all the things that matter to a sheriff (recent failures with relevant info)

The short-term goal is that a sheriff's *only* job should be to address the alerts shown on this page.

It'd be cool if s-o-m showed:
- how recent each failure is (e.g. 5m ago)

What would you do with this information? I'm trying to understand why people want this so I can come up with the right, minimal UI.

This was mainly motivated by forgetting something's been fixed (or starting a new sheriff shift).  Sometimes I see email after a failure has already been addressed and am unsure whether there's still a problem.  It'd also be useful to prioritize fixing things (e.g. this has been broken for 2h! other thing only broke 1m ago.)

The ability to just put notes on an item (tracked by username, with local time) is probably more useful, e.g.

  eglaysher@: fixed this by reverting d34db33f. (9/18 11:35am)

OR

  dbeam@: probably just a flake. (9/18 12:28pm)

OR

  pgervais@: rebooting the bot. (9/18 3:14pm)

Basically just a per-failure status.  We do this already by concatenating them all together in the status.
 
 
- recent reverts and what they're addressing

I mentioned something like this in https://code.google.com/p/chromium/issues/detail?id=401879#c7, but I guess we could implement this without implementing the revert button. The idea I had is to show in the regression range which of the patches have already been reverted. Would that be enough, or do you actually want a log of recent reverts? If the latter, can you say how you'd use it?

Additionally, I didn't expect that clicking on "Builder/tester name" goes directly to the failed build (e.g. http://bit.ly/1wFnfRX).  I expected clicks on "Builder/tester name" to go to ... builder/tester :) (e.g. http://bit.ly/1rcffof).  Unfortunately, I basically disregard failures until 2+ failed runs so the builder/tester's recent runs page (http://bit.ly/1rcffof) is more useful to me.

Yeah. I agree that this would be a good change for other reasons. Filed crbug.com/415800.

For this specific issue, sheriff-o-matic does this work for you. Failures that have only happened on one bot for one run show up in a separate list below the reliable failures list: "Failures that have only happened once (on one bot)". At least...it tries to do that. Might have bugs.

I noticed that today; nice feature.

--
Dan Beam

Dirk Pranke

Sep 18, 2014, 10:41:06 PM
to Ojan Vafai, hackability-cy, Dan Beam
For what it's worth, this worked maybe only once or twice for me. Most failures were "reliable", even when they had only happened once. So, yeah .. might have bugs ;).

-- Dirk

Julie Parent

Sep 18, 2014, 10:42:46 PM
to Dan Beam, Ojan Vafai, hackability-cy, Dirk Pranke
On Thu, Sep 18, 2014 at 7:27 PM, Dan Beam <db...@chromium.org> wrote:
The ability to just put notes on an item (tracked by username, with local time) is probably more useful, e.g.

We had a plan originally to do exactly this, but Ojan felt that associating with a bug instead was a better way to go about it (https://code.google.com/p/chromium/issues/detail?id=399734).  Sounds like a vote for a lightweight notes section rather than a more cumbersome bug?
 
  eglaysher@: fixed this by reverting d34db33f. (9/18 11:35am)

OR

  dbeam@: probably just a flake. (9/18 12:28pm)

OR

  pgervais@: rebooting the bot. (9/18 3:14pm)

Basically just a per-failure status.  We do this already by concatenating them all together in the status.
 
 


Dirk Pranke

Sep 18, 2014, 10:49:32 PM
to Julie Parent, Dan Beam, Ojan Vafai, hackability-cy
On Thu, Sep 18, 2014 at 7:42 PM, Julie Parent <jpa...@chromium.org> wrote:



We had a plan originally to do exactly this, but Ojan felt that associating with a bug instead was a better way to go about it (https://code.google.com/p/chromium/issues/detail?id=399734).  Sounds like a vote for a lightweight notes section rather than a more cumbersome bug?

Bugs are a good underlying data store, for obvious reasons. If we could make it really lightweight to create a bug, view comments, and add new comments, that might be ideal. I particularly like this in combination with the idea of tracking the open gardening/sheriffing-related bugs and showing them somehow in the tool as well (a la the gardening-blink label)

-- Dirk

Dirk Pranke

Sep 18, 2014, 10:51:37 PM
to Julie Parent, Dan Beam, Ojan Vafai, hackability-cy
On Thu, Sep 18, 2014 at 7:49 PM, Dirk Pranke <dpr...@chromium.org> wrote:



Bugs are a good underlying data store, for obvious reasons. If we could make it really lightweight to create a bug, view comments, and add new comments, that might be ideal. I particularly like this in combination with the idea of tracking the open gardening/sheriffing-related bugs and showing them somehow in the tool as well (a la the gardening-blink label)

Arr, hit send too soon. I think lightweight is more important than 'actual bug', though. Creating new bugs and adding comments should be one-click operations as much as possible. Eventually I'd like to see a one-click revert option and a one-click disable/suppress option as well (though obviously those are much harder to implement).

-- Dirk

Ojan Vafai

Sep 18, 2014, 11:04:19 PM
to Dirk Pranke, Douglas Stockwell, Julie Parent, Dan Beam, hackability-cy
On Thu, Sep 18, 2014 at 7:49 PM, Dirk Pranke <dpr...@chromium.org> wrote:
On Thu, Sep 18, 2014 at 7:42 PM, Julie Parent <jpa...@chromium.org> wrote:
On Thu, Sep 18, 2014 at 7:27 PM, Dan Beam <db...@chromium.org> wrote:

This was mainly motivated by forgetting something's been fixed (or starting a new sheriff shift).  Sometimes I see email after a failure has already been addressed and am unsure whether there's still a problem.  It'd also be useful to prioritize fixing things (e.g. this has been broken for 2h! other thing only broke 1m ago.)

The tool already sorts these newest-first, putting the most recent failures at the top. But maybe showing times to make that explicit and clear is worth doing. Filed https://code.google.com/p/chromium/issues/detail?id=415855
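
For illustration, the "5m ago" style of label requested earlier in the thread could be produced by something like the sketch below. The function name `format_age` and the thresholds are assumptions, not anything taken from sheriff-o-matic:

```python
from datetime import datetime


def format_age(failed_at: datetime, now: datetime) -> str:
    """Format how long ago a failure started, e.g. '5m ago' or '2h ago'."""
    # Clamp to zero so clock skew never yields a negative age.
    seconds = max(0, int((now - failed_at).total_seconds()))
    if seconds < 60:
        return "just now"
    minutes = seconds // 60
    if minutes < 60:
        return f"{minutes}m ago"
    hours = minutes // 60
    if hours < 24:
        return f"{hours}h ago"
    return f"{hours // 24}d ago"
```

A coarse label like this keeps the UI small while still letting a sheriff tell a two-hour-old breakage from one that appeared a minute ago.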
 
The ability to just put notes on an item (tracked by username, with local time) is probably more useful, e.g.

We had a plan originally to do exactly this, but Ojan felt that associating with a bug instead was a better way to go about it (https://code.google.com/p/chromium/issues/detail?id=399734).  Sounds like a vote for a lightweight notes section rather than a more cumbersome bug?

Bugs are a good underlying data store, for obvious reasons. If we could make it really lightweight to create a bug, view comments, and add new comments, that might be ideal. I particularly like this in combination with the idea of tracking the open gardening/sheriffing-related bugs and showing them somehow in the tool as well (a la the gardening-blink label)

There are a couple of things that might help with this:
1. Extend snooze to let you put in the revision at which the failure is fixed. Then it won't unsnooze unless that failure is still happening on a bot after that revision. I'm waiting for snooze to actually be stored on the server before extending it to do more.
2. An easy way to file a bug from sheriff-o-matic, i.e. a link that would take you to a bug prefilled with the right information. cbiesinger is working on this. I can't seem to find the bug for it at the moment.
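
The snooze extension in item 1 could look roughly like the sketch below. This is only an illustration under stated assumptions: revisions are modeled as increasing integers, and the `Snooze` record and `should_unsnooze` helper are hypothetical names rather than existing sheriff-o-matic code.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Snooze:
    """A snoozed alert; fixed_at_revision is the optional revision of the fix."""
    fixed_at_revision: Optional[int] = None


def should_unsnooze(snooze: Snooze, failing_build_revision: int) -> bool:
    """Decide whether a new failing build should bring the alert back."""
    if snooze.fixed_at_revision is None:
        # Plain snooze with no fix revision recorded: this sketch keeps it hidden.
        return False
    # A build that already contains the fix and still fails means the fix
    # did not work, so the alert should reappear.
    return failing_build_revision >= snooze.fixed_at_revision
```

Failing builds from before the fix revision stay hidden, which is the point of the feature: a sheriff does not get re-alerted while the fix is still cycling through the bots.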

There's no one working on the follow-up of showing bug titles in the UI or making it so that the bugs filed get linked back to sheriff-o-matic automatically (you have to click the file bug link, file the bug, and then copy-paste the bug number back into sheriff-o-matic). Help on making any of the above happen is very welcome. Mainly it needs some server-side integration that interacts with the Google Code Issue Tracker API.

The advantage of a bug is that you can have a discussion about it and everyone can see the whole discussion. I was planning on finishing the bug integration and making it very lightweight first and then seeing if we need something more lightweight. Maybe we should do notes now, though, since it'd be easy to implement. The hard part is figuring out the right, minimal UI. We can't just add a 4th button. The 3 buttons are already a little too much.