Tools for DevOps collaboration

peco

Dec 7, 2011, 5:28:01 PM
to devops-toolchain
At the heart of successful DevOps is good team communication and
collaboration. It becomes especially important as the dev & ops teams
become more distributed and grow in count.
What gems are people using for effective dev & ops collab? I am
especially interested in the troubleshooting, issue tracking, and
production issue solving aspects of things. A ticketing system is not
quite the answer, although it would certainly participate in the flow.
Email is not quite it either.
Troubleshooting a big system involving several teams requires a crisp,
clear, and timely exchange of detailed tech info and analysis. Is there
anything good out there?

Cheers

Adam Rosien

Dec 7, 2011, 6:12:24 PM
to devops-t...@googlegroups.com
I think chat/IRC/Jabber is the best tool for what you're describing. You can archive it, have bots perform tasks or relate status, and separate rooms for various groups are all great features.

.. Adam

Stan Chan

Dec 7, 2011, 6:15:58 PM
to devops-t...@googlegroups.com
I will have to +1 most of the Atlassian products. I have worked with all of their products and have to say they are well worth the price, and their prices are pretty reasonable. The components of their product line integrate tightly with each other and work well across the whole organization to connect many aspects of the development lifecycle. I have worked with many organizations, past and present, ranging from large Fortune 500s to small startups, and Atlassian has been instrumental to the success of all of our projects. I do not work for Atlassian, if anyone wants to ask. :-)

Scott Smith

Dec 7, 2011, 6:32:18 PM
to devops-t...@googlegroups.com, devops-t...@googlegroups.com
They all have their pros and cons. I choose to judge based on two things: API quality and the number of clicks required to manipulate a ticket. 

Right now github issues is winning my vote, hands down. Notifications suck and labels are super basic, but the web interface is concise when you need to use it. Can't say the same about anything else out there, really.

Afaik Jira still doesn't have a REST API soooo...

Will Lowe

Dec 7, 2011, 6:35:08 PM
to devops-t...@googlegroups.com
+1 on Jira and IRC/Campfire/IM.  But opening and closing Jira tickets is too slow during a real firedrill.

For those we've found a shared Google Spreadsheet to be pretty handy.  A bunch of rows like this:

| Issue Summary | Architecture/Product Component | Fix Owner | Status | Notes | ...

... goes a long way towards keeping the entire team focused during a big problem.
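A minimal sketch of that sheet as plain CSV, in case a spreadsheet is not at hand. The column names come from the row above; the `render_sheet` and `open_issues` helpers and the "fixed" status value are assumptions made for illustration, not part of any tool mentioned in this thread.

```python
import csv
import io

# Column names taken from the spreadsheet row described above.
COLUMNS = [
    "Issue Summary",
    "Architecture/Product Component",
    "Fix Owner",
    "Status",
    "Notes",
]


def render_sheet(rows):
    """Render a list of row dicts as CSV text, header first."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def open_issues(rows):
    """Return rows whose Status is anything other than 'fixed' (assumed label)."""
    return [row for row in rows if row["Status"].strip().lower() != "fixed"]
```

During a firedrill, `open_issues(rows)` gives the short list to read out on the bridge, and `render_sheet(rows)` gives something that can be pasted into chat or a wiki page afterwards.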

Will

Ernest Mueller

Dec 7, 2011, 8:00:01 PM
to devops-t...@googlegroups.com

Hmm, I don't think DevOps collaboration is just for super emergent issues, so let me answer with a longer list.

  • We use ZenDesk as a service desk for request management. It has problem/incident functionality too, but it's not really good for working a long-term issue with multiple participants because it is just a big threaded discussion; we end up doing problem management on a Confluence wiki page that may have subsidiary bug reports etc.
  • We use Campfire for chat and also log system changes from some automated sources into it.
  • We use Confluence wiki pages for release management.
  • We use a customized version of HP's PPM for bug tracking (dev and ops both)
  • We use Perforce for source control (dev and ops both)
  • We use TFS with telerik's TeamPulse addon (did use JIRA+Greenhopper but had whole-department level change) for task cards and burndown (dev and ops both)
  • We use WebEx for phone bridges (too often, in an emergent situation, people don't type into chat enough, and it allows you to say "I'M TALKING TO YOU!")
  • We have a mature and custom test procedure/test execution repository from our product side

I'm not really too happy with our suite of process tools overall.  I don't want some big ol' nasty suite, but I'd like something more integrated that could do basic change management (request and tracking), problem/incident management, and request management all in one without having to just make one off wiki pages.

Specific shortcomings:
1. Change logging to Campfire is good for communication but probably isn't a real auditable change log. And begs the question of change requests. We're not all 3l33t and Facebooky about "make a code change and push to production that day," we have ISO processes and whatnot we're working with.
2. Tasks that are planned in an iteration work well in TFS, but other unplanned tasks end up going into email and being dropped by both devs and ops. Considering having both devs and ops use ZenDesk to track "random other tasks" but a more GTD system would be nice.
3. Where's something that does problem/incident well at all?  Haven't seen it.
4. Integration.  Project management to tasks, problems/incidents to bugs, anything to wiki pages - it's a mess.

For emergent issues that get large and go longer than an immediate fix it pretty much goes: (using ICS for roles)  alert or report -> ticket in ZenDesk -> chat with anyone who happens to be online in Campfire -> trying to pull others in via email and phone tree -> Mix of Campfire and WebEx -> emails that get merged into a wiki doc -> specific bug reports if needed or ops intervention logged in Campfire -> fixes checked into Perforce -> change requests to release manager who also uses a wiki doc -> moves to dev/test requested in ZenDesk (or just in chat if the issue is an ongoing sev1 kind of thing) -> test executions -> release coordinated in Campfire.

I like the look of Datadog's "facebook for operations" product for collab too.

Ernest
______________________
UN-altered REPRODUCTION and DISSEMINATION of
this IMPORTANT information is ENCOURAGED.



Peco Karayanev

Dec 8, 2011, 1:00:45 PM
to devops-t...@googlegroups.com
+1 on having an open API. Any collab tool should support this.
 
Dell has inherited a suite from MessageOne called Incident Command Center. I am not sure how much of it is vaporware.
 
 
Came across this GeneralDynamics software CoMotion that allows more of a shared workspace for analysis and communication in context:
We need something like this but with devops focus!

Stan Chan

Dec 8, 2011, 1:22:20 PM
to devops-t...@googlegroups.com

Yaakov Nemoy

Dec 8, 2011, 5:14:43 PM
to devops-t...@googlegroups.com
We're using HipChat because it provides the benefit of everyone
always idling on IRC without actually keeping connections open
everywhere. The less technical people in our org get all the same
benefits and a GUI they can use, and the more technical people can
connect to it using Jabber, or IRC via bitlbee.

-Yaakov

Thompson Lee

Dec 8, 2011, 9:05:43 PM
to devops-t...@googlegroups.com
Smartsheet is pretty cool too.

Gaveen Prabhasara

Dec 9, 2011, 12:14:31 AM
to devops-t...@googlegroups.com
On Thu, Dec 8, 2011 at 4:42 AM, Adam Rosien <ad...@rosien.net> wrote:
> I think chat/IRC/Jabber is the best tool for what you're describing. You can
> archive it, have bots perform tasks or relate status, and separate rooms for
> various groups are all great features.
>
We are using an internally hosted XMPP/Jabber server, Openfire to be precise,
with a Jappix frontend for web-based access and persistent chats enabled.

--
Gaveen Prabhasara

Spike Morelli

Dec 9, 2011, 8:01:06 PM
to devops-t...@googlegroups.com
great topic Peco, we've just started trying to improve our comms/stack so I'm glad to see this thread popping up.

On 7 Dec 2011, at 22:28, peco wrote:

What gems are people using for effective dev&ops collab?

spoken word. Seriously, I appreciate the importance of good tools, but nothing will replace the clarity of the message and emotions that you get in a face to face meeting. I currently manage a distributed team so we don't get as much face to face time, but a couple of us are on site and most times we'll just get a room for 10 mins and talk things over. This includes discussing blockers, new projects and so forth.

Other than that our stack looks like:
* Github pull request to share code (say when we add monitoring code to their apps or modify config related stuff)
* Github wiki pages for project planning, postmortems/retrospectives and deployment documents (I have templates to fill out to simplify the job)
* campfire for chat (mostly because it's easier on some of the non eng folks)
* rally for scrum and kanban, altho I would strongly not recommend it, but I would recommend something that allows you to do kanban for ops and scrum for dev. I prefer this to more traditional BTSs simply because the flow is generally simpler, especially if the product is well done there's less overhead than in many ticketing systems (and btw you really want something supporting swim lanes)
* skype/something to use voice. This is the next best thing to face to face so I recommend you try and push for it and I say push because ime engineers will gravitate toward text based comms whenever possible.
* email for some comms, but we try to avoid it as it generally takes longer, causes more friction (being devoid of emotion) and falls through the cracks, as ops tend to get swamped with cron emails and that kind of stuff

I'd like to conclude by saying that in my experience the thing that went the farthest in improving our comms was creating process around it and communicating (no joke) the importance of it. I honestly doubt there will ever be a tool that encompasses all the comms needs in a devops team so the best chance is to let people know how important it is for them to take initiative and talk to people.

hope this helps,

Scott Smith

Dec 9, 2011, 8:40:09 PM
to devops-t...@googlegroups.com, devops-t...@googlegroups.com
Just gonna address what you said about cron mail. If you're getting email cron output, either something IS wrong or you are DOING something wrong.

-scott

Mohit Chawla

Dec 10, 2011, 2:55:46 AM
to devops-t...@googlegroups.com
We use Request Tracker, which works nicely for us. A big advantage is
being able to use and write plugins. For example, we currently send Nagios &
Pingdom notifications to RT, and Nagios/Pingdom extensions can merge
tickets for alerts, resolve them on recovery, and other stuff. Not
sure if JIRA has such an ecosystem of plugins etc. Also, since
everything can be done easily over email (new tickets, comments,
replies, notifications), there's not much to complain about or use in the
web interface, except administration-related tasks.
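The merge-on-repeat and resolve-on-recovery behaviour described above can be sketched in a few lines. This is only an illustration of the idea, not RT's API or the actual Nagios/Pingdom extensions: the `Ticket` and `AlertTickets` names, the in-memory store, and the "OK" recovery state are all assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Ticket:
    key: str                      # "host/service" the alerts refer to
    status: str = "open"          # "open" or "resolved"
    events: list = field(default_factory=list)


class AlertTickets:
    """Hypothetical in-memory stand-in for a ticket tracker fed by alerts."""

    def __init__(self):
        self.tickets = {}

    def handle(self, host, service, state, message):
        key = f"{host}/{service}"
        ticket = self.tickets.get(key)

        if state == "OK":
            # Recovery: resolve the open ticket, if there is one.
            if ticket is not None and ticket.status == "open":
                ticket.events.append(message)
                ticket.status = "resolved"
            return ticket

        # Problem state: open a new ticket unless one is already open,
        # in which case the alert is merged into it.
        if ticket is None or ticket.status == "resolved":
            ticket = Ticket(key=key)
            self.tickets[key] = ticket
        ticket.events.append(message)
        return ticket
```

Keying on host plus service is what keeps a flapping check from opening one ticket per notification; a real tracker would persist tickets and keep the history of resolved ones rather than overwriting them.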

Spike Morelli

Dec 10, 2011, 12:05:54 PM
to devops-t...@googlegroups.com
On 10 Dec 2011, at 01:40, Scott Smith wrote:

Just gonna address what you said about cron mail. If you're getting email cron output, either something IS wrong or you are DOING something wrong.

Fair point, I should have been clearer (poor communication on my side heh). When I said "cron email" I meant a largish volume of emails from automated systems. I appreciate that if the message is not important it should not have been sent in the first place, and in that sense you could say "you're doing something wrong", but in my experience this is easier in theory than in practice. What I've frequently seen is stuff configured to report lower-impact events via email, Nagios disk full 80% or log exceptions from the dev environment for example, and when you have a largish pool of machines that will generate enough emails to make ops, and to a lesser extent devs, less thrilled by email conversations.

I hope that helps clarify what I meant.

thanks,

Eric Shamow

Dec 12, 2011, 3:50:41 PM
to devops-t...@googlegroups.com

On Saturday, December 10, 2011 at 12:05 PM, Spike Morelli wrote:
Fair point, I should have been clearer (poor communication on my side heh ). When I said "cron email" I meant a largish volume of emails from automated systems. I appreciate that if the message is not important it should not have been sent in the first place and in that sense you could say "you're doing something wrong", but in my experience this is easier in theory than in practice. What I've frequently seen is stuff configured to report lower impact events via email, nagios disk full 80% or log exceptions from the dev environment for example, and when you have a largish pool of machines that will generate enough emails to make ops, and to a less extent devs, less thrilled by email conversations.

I don't mean to belabor the point, but this is, I believe, exactly what Scott meant by "you're doing it wrong."  Cron output shouldn't be informational.  It should be a hard stop, "I broke" message.  Informational output should be aggregated and controlled through monitoring.

As somebody who walked into a job a few years ago where 10K cron mails in an evening was not uncommon, I sympathize with the "easier in theory than in practice."  On the other hand, I was able to use these mails as a roadmap to the least touched servers in my environment - and thus those most in need of rehabilitation.

The danger is that the constant bleat of "expected" cron mail - "oh yeah, there's server xyz at 92% again, but we can't do anything until the new disks come in next quarter" - causes information fatigue and makes sysadmins ignore the really critical cron messages.  An e-mail from cron should be the kind of thing that makes you sit up and notice.
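One way to get there, sketched under the assumption that a job's exit status is a trustworthy signal: since cron mails whatever a job prints, wrap the job so that it prints nothing when it succeeds and prints its full output only when it fails. The `run_quiet` and `main` names are made up for this example.

```python
import subprocess
import sys


def run_quiet(argv):
    """Run argv and return (exit_code, report).

    report is empty when the command exits 0; otherwise it holds the
    command's combined stdout and stderr.
    """
    result = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    if result.returncode == 0:
        return 0, ""
    return result.returncode, result.stdout


def main(argv):
    """Entry point for a wrapper script: emit output only on failure."""
    code, report = run_quiet(argv)
    sys.stdout.write(report)
    return code
```

A wrapper script would call `sys.exit(main(sys.argv[1:]))` and be placed in front of the real command in the crontab entry. Jobs that exit 0 while something is actually wrong would still slip through, which is where the monitoring mentioned above has to take over.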


-- 

Eric Shamow
Professional Services

Zaeem Arshad

Dec 13, 2011, 5:00:57 AM
to devops-t...@googlegroups.com

Great thread indeed. We are a small team and located on the same floor. I have found Atlassian Confluence and JIRA to be extremely useful coupled with GreenHopper. Some of the recommendations on this list look really interesting though.


Regards

--
Zaeem