Help with making the case for opening up software development

Anony Moose

Feb 18, 2014, 12:46:23 AM

to us-govern...@googlegroups.com

Hi,

I work for a U.S. government organization that produces a lot of software, and I have been trying to promote a culture of open-source for several years now, but have been frustrated by the slow pace of change.

I'm hoping to get some help addressing some of the concerns that our policy-makers have. I'd especially like to hear about specific case studies of other organizations where these concerns were addressed, and how.

TL;DR - These are in order of decreasing importance:
1. Can developers be allowed to push directly to GitHub, without every commit going through a review process?
2. What to say to those who don't want to expose our messy coding practices?
3. Won't it be a lot more work to support community interaction?
4. Isn't it okay (good, even) to put unfinished / unpolished code out in the open?

1. The main concern is with security. Now, I understand that we can't expose our internal infrastructure, such as server names and directory paths (not to mention credentials). And I understand and agree that that means that any *existing* software should be reviewed before it is put on GitHub. But, our security group is saying that this also means that no developer can ever push directly to a public repository; that every change-set would have to get reviewed before being pushed to public. This seems too strict to me, and I think it would add a lot of friction, and hinder real collaboration between our devs and the community. But, maybe I'm wrong. Do other federal agencies operate under these restrictions?

2. Another major argument against putting our code in the open is that we wouldn't want every commit of every developer out in the open, where anyone could see them. There are two issues here, really: A, fear of embarrassment (or maybe it is fear that the developers would feel embarrassed -- I'm not sure), and B, the potential to expose us to criticism and/or liability, since some of what we do is of a political nature. To address this, an agency could come up with workflows where changes are pushed in batches, after a review process and a rebase to combine commits. But here, again, I think the cost of added friction would far outweigh the benefits.

3. A third argument is that it would take too much time and effort to interact with the community and deal with spurious or low-quality issues and pull requests. I have some sympathy with this concern. We do have help desks with dedicated employees to deal with a lot of our existing interactions, and I'm curious about how other groups deal with this issue. I imagine that some very visible projects like "We The People" must have been deluged. If a project is on GitHub, is there the potential to get inundated with (let's say) non-value-adding feedback? If so, how do you deal with it?

4. Finally, there is the idea that our code has to be in really good, clean shape, and has to be reusable by the public, before we can put it out there. This is similar to the "embarrassing" issue, but not quite the same. I'm of the opinion that we should *start* projects in the open (on GitHub or wherever), and develop them that way from the beginning, and that it's even okay if our organization gets littered with a lot of half-baked and abandoned projects. I've seen a couple of organizations that even put things up like website code (data.gov, for instance) which, it seems to me, wouldn't ever be something that someone in the public would have a lot of use for.

Thanks for any feedback!

P.S. I'm writing this anonymously, even though I realize it's pretty ironic, because I'm afraid people at my org. who see it might think it reflects badly on us, and that would, of course, hurt my case.

Heyman, Leigh

Feb 18, 2014, 10:52:55 PM

to Anony Moose, us-govern...@googlegroups.com

Hi Anon,

A couple of quick notes about your anonymity before I respond to your questions. First and foremost, we want to help. There are plenty of folks here who are just a phone call away, and who would be more than happy to help you engage with your leadership, but we can't do that if we don't know who you are. Second, by its very nature, if you're going to develop in public you have to be open to the possibility of "looking bad" occasionally (some might say frequently), so sooner or later you’re going to have to take that leap of faith one way or another. :) Perhaps taking that risk with a sympathetic community like ours might actually give you the opportunity to demonstrate to your organization the positive effects of doing so?

So, with that being said I will do my best to address your concerns as openly and honestly as your anonymity will allow.

Preface: All concepts that follow depend on a high degree of faith in your engineers' professional judgment. If your relationship with your dev, and ops, and security engineers is at all adversarial then you have bigger problems that might need to be addressed first.

The first thing I'll say is that iterative development and continuous integration (or as close as you can get to it) are your friends. I think high-frequency, small releases help solve a lot of the problems you've raised.

Put simply, the more frequently you release, the less code goes into each release, which reduces the burden of review. This benefits you in two ways. First, it allows you to integrate a lighter-weight review process into your development workflow, which, if done well, also happens to get you better code all around. Second, I presume the "review process" you refer to means an external review (presumably by security/comms personnel). Sending those external stakeholders a high volume of review requests, the vast majority of which are small, less significant commits, allows you to a) quickly establish a pattern of trust with them and b) solidify the notion that not all commits are equal. Hopefully, over a reasonably short time, this will allow you to shift away from a policy of "all commits must be reviewed" to "critical commits must be reviewed." And with that trust established, hopefully the revised policy will trust the engineers to determine which commits are "critical," or at least establish a bright line of some sort, e.g. all commits over N lines must be reviewed, or all commits that touch authentication, or any commit messages or issue updates that touch on policy, timeline, roadmaps, etc.
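To make the "bright line" idea concrete, here is a minimal Python sketch of how such a rule might be written down. Everything in it is a hypothetical illustration rather than an established policy: the 200-line threshold, the `auth/` and `login/` path prefixes, the keyword list, and the `Commit` structure are all assumptions that a team would need to agree on with its security and communications reviewers.

```python
from dataclasses import dataclass, field

# Hypothetical values for illustration only; agree on real ones with reviewers.
MAX_LINES = 200                                      # the "N lines" bright line
SENSITIVE_PATHS = ("auth/", "login/")                # code that touches authentication
SENSITIVE_WORDS = ("policy", "timeline", "roadmap")  # topics comms may want to see


@dataclass
class Commit:
    """A simplified stand-in for a commit: message, size, and touched paths."""
    message: str
    lines_changed: int
    paths: list = field(default_factory=list)


def needs_external_review(commit: Commit) -> bool:
    """Return True if the commit crosses any of the agreed bright lines."""
    if commit.lines_changed > MAX_LINES:
        return True
    # str.startswith accepts a tuple of prefixes.
    if any(path.startswith(SENSITIVE_PATHS) for path in commit.paths):
        return True
    message = commit.message.lower()
    return any(word in message for word in SENSITIVE_WORDS)
```

A rule like this would not replace peer review; it would only decide which commits get escalated beyond the development team.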

The rest of my thoughts are in-line below.

TL;DR - These are in order of decreasing importance:
1. Can developers be allowed to push directly to GitHub, without every commit going through a review process?

No. In fact I submit you actually don't want this anyway. But per above, it all depends on how you shape the review process itself. A development workflow that requires peer review of all commits allows for an opportunity to flag potential commits that require a higher level review, as well as common sense review of commit messages. One option here is to use a combination of private and public repos so you can do your code review in the private repo then pull to public when it passes review.

Also, you get better code.

2. What to say to those who don't want to expose our messy coding practices?

A. Say "let’s stop using messy coding practices." As you mentioned a couple times, if you separate the problem of existing code from new code, this becomes a bit more realistic. If you establish in the culture that you're building for open source, even with code that will never be part of a public repo, you essentially challenge your engineers to develop and indoctrinate the kinds of practices that no longer raise those types of concerns. As Ben points out in the post I just linked, there's also a level of self-awareness that occurs— people who KNOW their code is going to be open source write code differently.

And again, you get better code. And,

B. Accept that on some level, all code is always messier than you'd like it to be. One of the core concepts of open-source development is transparency— it's as much about community-building as it is about the code itself— and with that comes a level of intellectual honesty. In that context, trying to present the world with fully polished code and perfect coding practices and hiding away the warts is the open-source equivalent of astroturfing.

3. Won't it be a lot more work to support community interaction?

For this one, it's really as much or as little as you want it to be. Obviously more begets more, and less begets less. But one viable strategy is to time box it. Basically have one engineer for a set period of time, say four or eight hours per week, triage the issue queues and pull requests. If you rotate that responsibility weekly across your dev team, the impact ends up being surprisingly low. Theoretically, if your engagement is effective this time box will need to grow over time, but hopefully by then not only will your leadership be more bought in to it, but you’ll also have a better sense of the value added to your own programs by the community with which you’re engaging, and have a solid footing on which to base any cost-benefit tradeoffs.

4. Isn't it okay (good, even) to put unfinished / unpolished code out in the open?

Yes. See response to #2 above. This resistance is always present initially, especially when dealing with legacy code. Engineers always want more time to "polish the code" before making it public: "I don't want my name associated with this crappy code," "this is embarrassing and will make us look bad," etc. But let me ask you, have you ever met a developer who said "this piece of code is exactly how I want it"? :) As I mentioned in my intro, there's a point at which you just have to take the leap of faith. There will always be detractors, no matter how good the code is, and that simply can't be helped. But at the end of the day the community can tell the difference between honest engagement and "checking the open source box," and it's been my experience that the former invites allies and the latter invites detractors.

Also, one of the other core concepts is that open source means that you get help with your code. If you're only publishing polished code you're actually denying yourself one of the biggest benefits, so why bother in the first place?

To sum up, there are two separate levels of concern here. The first is institutional fear, and I think the only way to overcome it is to embrace that fear by engaging with those concerned and welcoming a level of oversight that both allays those fears and gets you better code. If you implement that oversight right, it will be less burdensome than you expect, and will also lighten over time as leadership becomes more comfortable with the notion. The second is engineering fear – the "I don't want my bad code out there" argument – which I think you tackle by challenging the engineers to evolve. If the current processes are too messy for their standards, then you can engage with them and challenge them to come up with, and adhere to, new ones.

I guess what I’m saying is that the cultural changes cannot be achieved simply in the context of a desire for an open-source culture, but that they need to be part of a larger holistic shift in how you develop code. But then I’m probably stating the obvious there.

Anyway, I know this was long, but I hope it helps, and I’d be happy to talk more, if only I knew who you were…

-L

Eric Mill

Feb 18, 2014, 11:14:43 PM

to Heyman, Leigh, Anony Moose, us-govern...@googlegroups.com

On Tue, Feb 18, 2014 at 10:52 PM, Heyman, Leigh <Leigh_...@oa.eop.gov> wrote:

1. Can developers be allowed to push directly to GitHub, without every commit going through a review process?

No. In fact I submit you actually don't want this anyway. But per above, it all depends on how you shape the review process itself. A development workflow that requires peer review of all commits allows for an opportunity to flag potential commits that require a higher level review, as well as common sense review of commit messages. One option here is to use a combination of private and public repos so you can do your code review in the private repo then pull to public when it passes review.

Also, you get better code.

This whole email was an *excellent* response, and I'm not (and haven't ever been) a government employee, so I can't truly speak to the poster's circumstances, take it with grains of salt, etc.

However, this particular answer is mixing a bunch of things together - you should always be able to arrive at a place where developers push directly to Github, without every commit being reviewed.

* Peer code reviews are A+, but a fundamental part of GitHub's entire pull request workflow is doing those code reviews *inside GitHub*. Individual developers can perform work on their own branches or forks, and discuss that work before merging upstream. That's very different from not allowing code to appear publicly at all without review.

* The benefits of code reviews by development team peers and product managers are different from the kinds of security reviews the poster is talking about. Arguing for peer code reviews is a good thing, and may lighten the burden on the security team, especially if metrics get established like Leigh's talking about. But it is separate from a decision on whether developers push directly to Github.

That's all, I agreed with the general thrust of the answer -- it just seems very grim to say that government agencies should not allow their developers to push directly to Github as a blanket statement.

One final general statement: working in public, treating it like breathing, changes human behavior in all sorts of ways, and in my experience, all for the better. It'd be difficult for me now to work any other way.

-- Eric

--

Developer | sunlightfoundation.com

Heyman, Leigh

Feb 18, 2014, 11:56:31 PM

to Eric Mill, Anony Moose, us-govern...@googlegroups.com

From: Eric Mill [mailto:er...@sunlightfoundation.com]
Sent: Tuesday, February 18, 2014 11:15 PM
To: Heyman, Leigh
Cc: Anony Moose; us-govern...@googlegroups.com
Subject: Re: Help with making the case for opening up software development

On Tue, Feb 18, 2014 at 10:52 PM, Heyman, Leigh <Leigh_...@oa.eop.gov> wrote:

1. Can developers be allowed to push directly to GitHub, without every commit going through a review process?

No. In fact I submit you actually don't want this anyway. But per above, it all depends on how you shape the review process itself. A development workflow that requires peer review of all commits allows for an opportunity to flag potential commits that require a higher level review, as well as common sense review of commit messages. One option here is to use a combination of private and public repos so you can do your code review in the private repo then pull to public when it passes review.

Also, you get better code.

This whole email was an *excellent* response, and I'm not (and haven't ever been) a government employee, so I can't truly speak to the poster's circumstances, take it with grains of salt, etc.

Thanks!

However, this particular answer is mixing a bunch of things together - you should always be able to arrive at a place where developers push directly to Github, without every commit being reviewed.

* Peer code reviews are A+, but a huge fundament of Github's entire pull request workflow is to do those code reviews *inside Github*. Individual developers can perform work on their own branches or forks, and discuss that work before merging upstream. That's much different than not allowing code to appear publicly at all without review.

That's all, I agreed with the general thrust of the answer -- it just seems very grim to say that government agencies should not allow their developers to push directly to Github as a blanket statement.

Right, sorry, I agree completely, I should have been more explicit-- in fact I’m kicking myself a little here because I had an earlier draft that was more specific about how I meant this but it was already getting too long. The version that I cut stated specifically that you don’t want engineers pushing directly to your org’s public/production version of a repo without review. That’s more or less what I was getting at with the recommendation of devs working on forks of a private repo before pulling to a public one. I am definitely not recommending against leveraging GitHub’s workflows.

One final general statement: working in public, treating it like breathing, changes human behavior in all sorts of ways, and in my experience, all for the better. It'd be difficult for me now to work any other way.

Very well put!

Tony Pujals

Feb 19, 2014, 1:31:16 AM

to Heyman, Leigh, Eric Mill, Anony Moose, us-govern...@googlegroups.com

1. Can developers be allowed to push directly to GitHub, without every commit going through a review process?

No. In fact I submit you actually don't want this anyway. But per above, it all depends on how you shape the review process itself. A development workflow that requires peer review of all commits allows for an opportunity to flag potential commits that require a higher level review, as well as common sense review of commit messages. One option here is to use a combination of private and public repos so you can do your code review in the private repo then pull to public when it passes review.

Also, you get better code.

This whole email was an *excellent* response, and I'm not (and haven't ever been) a government employee, so I can't truly speak to the poster's circumstances, take it with grains of salt, etc.

Thanks!

Actually, yes -- for both individual and organization repos, you can allow other developers to be collaborators. Simply go into the repo settings (right sidebar), and then select Collaborators (left sidebar).

This works great with small, cohesive teams, especially with effective branching strategies.

However, as far as letting the rest of the world make contributions, you really do want to stick with the fork & pull model. External developers interested in making contributions simply fork your repo; when they want to contribute their updates, they submit a pull request, which gives you the opportunity to review the differences and even enter into a running discussion with the contributor to clarify various differences before you choose to accept and merge them.

See this link:

https://help.github.com/articles/using-pull-requests

2. What to say to those who don't want to expose our messy coding practices?

I can't do this topic justice in a few words; I don't want to offend those who have found themselves in the situation where they are too embarrassed to let people see their work quality. But I would strongly urge the team to consider starting new projects (however small at first) in a way that they can feel good about sharing publicly and invite contributions from others. It's actually quite a satisfying feeling when someone notes something that can be improved and submits a pull request for your approval; it's nice to know others are interested in helping out and flattering for the contributor to have the pull request get accepted. This is the spirit of the open source community that fuels so much innovation. Embrace it.

3. Won't it be a lot more work to support community interaction?

It can be. More realistically, do you anticipate the project to be so popular that community interaction will be an issue? This is actually a great position to be in, if you get there. Take an extremely popular project like Bootstrap with two core developers (@mdo (https://github.com/mdo) and fat (https://github.com/fat)) and around 500 total contributors. The reward is more feedback and greater quality and ultimately a successful project:

http://getbootstrap.com/

https://github.com/twbs/bootstrap/issues

https://github.com/twbs/bootstrap/pulls (over 23,000 forks)

4. Isn't it okay (good, even) to put unfinished / unpolished code out in the open?

Absolutely. You want feedback and improvements early. This is the hardest thing for developers not used to the open source community to get used to. You'll find that a lot of the embarrassment goes away for rough code when you at least commit a reasonable number of unit tests with it and a decent readme.

The main concern is with security. Now, I understand that we can't expose our internal infrastructure, such as server names and directory paths (not to mention credentials).

You should not check in these types of dependencies. Spell out what the dependencies are so that everyone can set up their own environment. See this link about configuration for Twelve-Factor apps:

http://12factor.net/config

Environment variables can contain the values or the paths to the data necessary to get an application correctly configured and provisioned with credentials, etc. As the link explains, configuration files can be used, but it's easy and tempting to pass them around and possibly check into version control accidentally.
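As a rough illustration of the environment-variable approach, here is a minimal Python sketch. The variable names (`DB_HOST`, `DB_PASSWORD`, `DB_PORT`), the default port, and the `load_config` helper are hypothetical, chosen only for this example; the point is that the repository holds the names of the settings while the values live in each environment.

```python
import os

# Hypothetical setting names; the values never appear in the repository.
REQUIRED = ("DB_HOST", "DB_PASSWORD")


def load_config(env=None):
    """Build the app configuration from environment variables.

    `env` defaults to os.environ; passing a plain dict makes this easy to test.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        # Fail early and name what is missing, without echoing any values.
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {
        "db_host": env["DB_HOST"],
        "db_password": env["DB_PASSWORD"],
        "db_port": int(env.get("DB_PORT", "5432")),  # assumed default port
    }
```

A README can then list the required variable names, so outside contributors can set up their own environment without ever seeing internal server names or credentials.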

Bottom line, it's great that you are championing efforts to increase productivity and quality through change and transparency in your software culture. Best of luck!

Tony

@subfuzion

Noah Kunin

Feb 19, 2014, 11:44:25 AM

to us-govern...@googlegroups.com, Heyman, Leigh, Eric Mill, Anony Moose

Related note: I'm going to start putting together a guide for governments on using DVCS. Beyond the two big players (GitHub / BitBucket), any other suggestions of systems to evaluate? I'd like the guidance to be platform agnostic and widely applicable.

James Stewart

Feb 19, 2014, 12:12:51 PM

to us-govern...@googlegroups.com

It can be useful to distinguish a choice of DVCS (eg. git) from the collaboration tools you might use with it and/or places where you expose that code (eg. github)

We wrote a very basic piece on version control for the UK Government Service Design Manual - https://www.gov.uk/service-manual/making-software/version-control.html because the really important thing for us was that teams think about and use version control, but we want the specific choices to be the team's choice.

I've since supplemented that with a blog post (that may well be of interest to the original poster) about how our teams within GDS have chosen to use git and how we use github - https://gdstechnology.blog.gov.uk/2014/01/27/how-we-use-github/

James.

Gray Brooks

Feb 19, 2014, 1:35:47 PM

to James Stewart, <us-government-apis@googlegroups.com>

Thanks, James. It's awesome that you posted those. +1 to the other feedback already given.

A few more thoughts, too.

A) I definitely would echo the suggestion of starting small with some negligible projects that no one could have any real concern about. In my experience, getting your agency to dip its toe in the water is a great way of smoothing the path to where you're trying to get.

B) Ben Balter's done some really good writing directly focused on these topics. I recommend them all if you have a chance.

C) Also be sure to dive into these two:

D) There's also now a Government Open Source listserve that's worth hitting up in the future as well. I already told them about this thread.

E) For specific examples of what you're looking for, there are a number of projects that you can often detect through # of forks and/or activity. Check out these two mashups.

Along the lines of Leigh's points about anonymity, feel free to directly get in touch and I'm happy to give more specific examples directly.

Gray

------------------------------

c - 205.541.2245

Sr. API Strategist

Digital Services Innovation Center - GSA

* HowTo.gov/api

* US Government APIs listserve

Anony Moose

Feb 20, 2014, 12:56:23 AM

to gray....@gsa.gov, James Stewart, <us-government-apis@googlegroups.com>

Hi,

Thanks for all your responses. Sorry for staying anonymous; no one appreciates the irony more than I do. I feel like I've been fighting a real uphill battle. In the past I tried to push the envelope and did more harm than good: for example, I carelessly exposed an internal server name on a GitHub repo, and now our security folks have an example of why they need to keep things locked down. So, it's not that I'm afraid of looking bad; I'm afraid that criticizing my employer in public will make my goal that much harder to achieve.

Leigh, regarding code reviews, yes, I am all in favor of those, and the more the better. But, as Eric described, I was talking about the ability to use GitHub as the main development repository, independent of the specific workflow that's established for a project. So, yes, devs should work in a branch, and code should be reviewed before being merged to master, say. But unless we're going to have a mix of private/public repos, then the branches and the pull requests should happen on the same public repository. But that is what our security folks are saying that we cannot allow. I didn't see anything in your response to suggest that, from a security standpoint, every developer commit must be reviewed before being pushed.

I really liked your answer to my other points, and thanks also to the others who responded. There were a lot of good points, and I will incorporate them into my presentation.

Gray, nice mashups!

Thanks very much.

Bill Shelton

Feb 20, 2014, 8:06:43 AM

to us-govern...@googlegroups.com, gray....@gsa.gov, James Stewart

Thanks to all the posters. There's some great collective data here!

One thing I know government security teams care about is the ability to satisfy federal audits. Audits can be very rigorous, authoritative, and are key measures with many CISOs. Audits involve presenting artifacts surrounding controls as defined by NIST 800-53. Well defined processes and procedures are critical pieces to answering audits. These need to be in writing *and* you need to demonstrate that they are followed. In other words, you need to align what's done in practice with documented policies and procedures.

With respect to source code, there are a couple of tactical approaches:

(1) communicate and demonstrate that source code in and of itself is *not* executing software in a production environment (e.g., if hosted on GitHub)

(2) stand on the shoulders of giants--look at the stellar OSS work of the DoD--http://dodcio.defense.gov/OpenSourceSoftwareFAQ.aspx, who formally defined source code as "data", which may dictate that it can be handled differently

(4) Conversations are important, but you need concrete artifacts: I started a generalized template that might help: http://if.io/open-source-program-template/index.html. It has some policies and procedures that security teams appreciate. It also has an open source policy, a checklist, and a link to a proof-of-concept tool that inspects git repositories for string patterns that shouldn't be there (PII or personally identifiable information): https://github.com/virtix/clouseau (We're actually thinking about making this a plugin for Travis and have implemented a simple post-commit hook that blocks commits when they contain certain strings. These are the kinds of demonstrable controls that help with adoption.)

(5) Start small. Get buy-in to pilot something small and very low risk.
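As an illustration of the kind of demonstrable control described in (4), here is a minimal Python sketch of a string-pattern check. It is not how Clouseau is implemented; the pattern names, the regular expressions, the placeholder domain `internal.example.gov`, and the `find_violations` helper are all assumptions made up for this example.

```python
import re

# Hypothetical patterns for illustration; a real list would come from the
# security team and would need tuning to limit false positives.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "internal_host": re.compile(r"\b[\w.-]+\.internal\.example\.gov\b"),
    "password_assignment": re.compile(r"password\s*=\s*\S+", re.IGNORECASE),
}


def find_violations(text):
    """Return (line_number, pattern_name) pairs for every suspicious line."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for name, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits
```

One way to use a check like this, as an assumption about wiring rather than a description of the tool above, is to run it from a git pre-commit hook over the output of `git diff --cached` and exit non-zero when it returns any hits, so the commit is refused before it leaves the developer's machine.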

On the subject of anonymity, it should be a matter of principle and ethics—if you're not willing to be accountable for something you say or write, then you're either not ready to say it, or, perhaps, it shouldn't be said. The amount of change you can make is directly proportional to your willingness to deal with friction and heartbreak.

-bill

Anony Moose

unread,

Feb 20, 2014, 8:29:36 AM2/20/14

to Bill Shelton, <us-government-apis@googlegroups.com>, gray....@gsa.gov, James Stewart

On Thu, Feb 20, 2014 at 8:06 AM, Bill Shelton <vir...@gmail.com> wrote:

Thanks to all the posters. There's some great collective data here!

One thing I know government security teams care about is the ability to satisfy federal audits. Audits can be very rigorous and authoritative, and they are key measures for many CISOs. Audits involve presenting artifacts surrounding controls as defined by NIST 800-53. Well-defined processes and procedures are critical to answering audits. These need to be in writing *and* you need to demonstrate that they are followed. In other words, you need to align what's done in practice with documented policies and procedures.

With respect to source code, there are a couple of tactical approaches:
(1) communicate and demonstrate that source code in and of itself is *not* executing software in a production environment (e.g., if hosted on GitHub)

(2) stand on the shoulders of giants--look at the stellar OSS work of the DoD--http://dodcio.defense.gov/OpenSourceSoftwareFAQ.aspx, who formally defined source code as "data", which may dictate that it can be handled differently

(4) Conversations are important, but you need concrete artifacts: I started a generalized template that might help: http://if.io/open-source-program-template/index.html. It has some policies and procedures that security teams appreciate. It also has an open source policy, a checklist, and a link to a proof-of-concept tool that inspects git repositories for string patterns that shouldn't be there (PII, or personally identifiable information): https://github.com/virtix/clouseau (We're actually thinking about making this a plugin for Travis and have implemented a simple post-commit hook that blocks commits when they contain certain strings. These are the kinds of demonstrable controls that help with adoption.)

(5) Start small. Get buy-in to pilot something small and very low risk.

These are outstanding, Bill, thanks. Anything that can ease the minds, and make the lives easier, of the security folks is very valuable.

On the subject of anonymity, it should be a matter of principle and ethics—if you're not willing to be accountable for something you say or write, then you're either not ready to say it, or, perhaps, it shouldn't be said. The amount of change you can make is directly proportional to your willingness to deal with friction and heartbreak.

Sorry, I really have to take issue with this. If I had lambasted my organization *by name*, writing anonymously, that would be one thing. In this case, all I did was post some questions, which for all intents and purposes could be considered hypotheticals. I don't see where I crossed any ethical boundaries -- and the results have been great. Lots of good feedback and discussion, and these suggestions will be here in the list archives, for others to benefit from. I feel pretty certain that my situation is not unique. And please don't talk to me about friction and heartbreak -- believe me when I tell you I am at my wit's end.

-bill

Eric Mill

unread,

Feb 20, 2014, 11:22:53 AM2/20/14

to Anony Moose, Gray Brooks, James Stewart, <us-government-apis@googlegroups.com>

On Thu, Feb 20, 2014 at 12:56 AM, Anony Moose <feicha...@gmail.com> wrote:

Thanks for all your responses. Sorry for staying anonymous; no one appreciates the irony more than I do. I feel like I've been fighting a real uphill battle, and have in the past tried to push the envelope and did more harm than good: for example, by carelessly exposing an internal server name on a GitHub repo, I gave our security folks an example of why they need to keep things locked down.

It's weird - even outside the government, in a fairly freewheeling technical environment, our sysadmin gets antsy about doing things like exposing internal server names in our code. I don't know why security folks have this mindset - if someone breaches the wall, they're going to find out everything they need to know.

I've started publicly versioning as much system stuff as I can - for one of our APIs, I have our deployment script and even our nginx configuration in our public repository. It makes me feel *more secure* to do this - it's an affirmation to myself that our system can withstand scrutiny, and anyone who wants to suggest a contribution to make it more secure can easily do so.

I know this is not the battle you're fighting, it's just a mindset that is not unique to government, and baffles me. Truly secure systems -- from customer service practices, to password hashing algorithms, to server configuration -- are more secure when their methods are publicly known.

-- Eric

--

Developer | sunlightfoundation.com

Heyman, Leigh

unread,

Feb 20, 2014, 11:40:07 AM2/20/14

to Anony Moose, Brooks, Gray, James Stewart, <us-government-apis@googlegroups.com>

On 2/20/14 12:56 AM, "Anony Moose" <feicha...@gmail.com> wrote:

But unless we're going to have a mix of private/public repos, then the branches and the pull requests should happen on the same public repository.

So why not use a mix of public/private? It lets your engineers develop the kind of workflows and coding practices that will get you better code even if it never ends up in a public repo; it lets you build workflows that integrate security checks/reviews into them. It lets you partner with the security team to collaborate on a set of workflows that they will (eventually) approve of. Open source isn't an all-or-nothing proposition.

But that is what our security folks are saying that we cannot allow. I didn't see anything in your response to suggest that, from a security standpoint, every developer commit must be reviewed before being pushed.

Why treat security reviews differently from code reviews? I mean, peer reviews should include security best practices anyway, right?

I guess what I'm saying, again at the risk of stating the obvious, is that the most effective way to overcome this kind of cultural resistance is to engage and collaborate with those resisting. They can't reasonably reject a set of policies they helped draft.

James Stewart

unread,

Feb 21, 2014, 7:46:53 AM2/21/14

to <us-government-apis@googlegroups.com>

On 20 February 2014 16:40, Heyman, Leigh <Leigh_...@oa.eop.gov> wrote:

On 2/20/14 12:56 AM, "Anony Moose" <feicha...@gmail.com> wrote:

But unless we're going to have a mix of private/public repos, then the branches and the pull requests should happen on the same public repository.

So why not use a mix of public/private? It lets your engineers develop the kind of workflows and coding practices that will get you better code even if it never ends up in a public repo; it lets you build workflows that integrate security checks/reviews into them. It lets you partner with the security team to collaborate on a set of workflows that they will (eventually) approve of. Open source isn't an all-or-nothing proposition.

Absolutely - very few people operating at any scale have absolutely every detail out in public, or only deploy from public repositories. To my mind the important thing is to default to open, but to also build up a mature enough delivery culture that you can consider security risks as part and parcel of every decision you're making.

We've found that there is a mixture of approaches captured under the label "security team" but that generally the teams we've met around government are risk management specialists rather than security specialists per se. They're not the people who will run penetration tests, they're the ones who'll tell you that it's important to have them. I'm not sure if that's the case across the US government too, but I wouldn't be surprised.

What's worked well there is to really engage in those conversations about risk. The risks inherent in any system are a really important consideration when designing it, and the worst problems come about when they're dealt with independently of the design and development, or tacked on at the end. For example, rather than having to put incredibly complex defences around a database that holds all your data, perhaps you could have partitioned that data early on? That's often only practical to do when the risks are part of the planning process (there's a bit more on that at https://gds.blog.gov.uk/2014/02/10/striking-a-balance-between-security-and-usability/ )

There's a really good document about considering risks in source code management that was produced by the arm of our security services responsible for information assurance. Unfortunately it's currently restricted so I can't share it, but it was really helpful that they articulated their concerns as it gave us a chance to respond to them in detail. For some very sensitive systems there are legitimate concerns around even exposing the identities of the developers working on the code, but the core one for us was about early detection of attacks.

The argument goes that if the design of a system is closed then an attacker will need to do extensive probing and testing before they're able to complete a successful attack. If you're monitoring that probing you'll get an early warning, and (hopefully) be able to take counter-measures. If you went as far as to produce an appliance version of your code that an attacker could set up in their own environment then they could do their preparation secretly and you wouldn't get an early warning. In most cases you can respond to that with an argument about defence in depth - your library code or your application code may be open, but your firewall details, your network config, any filters you use, and so on may be kept private. At that point it becomes a conversation about proportionality and risk vs. opportunity.

There's a real opportunity to raise the level of the conversation around security in most contexts. When you have passionate developers they'll care about the context of the system, the risks to it and the best ways to build in security. And when that happens you can get well beyond the traditional approaches of throwing things over fences for review or testing and towards ideas like making security testing part of your continuous integration process.

James.

Heyman, Leigh

unread,

Feb 21, 2014, 9:57:50 AM2/21/14

to James Stewart, <us-government-apis@googlegroups.com>

James, that was a fantastic response! I have very little meaningful feedback to add, except to say thank you for giving me something to read that made a great start to my work day!

I couldn't agree more with your points about baking security and risk-management considerations into the software planning and development layers early and often, rather than tacking them on at the end (this concept holds in things like UX and interface design as well; risk always ensues when they're an afterthought).

Thank you!

-L

Ben Balter

unread,

Mar 19, 2014, 12:51:34 PM3/19/14

to feicha...@gmail.com, <us-government-apis@googlegroups.com>

A bit slow to respond, but a great conversation and lots of great advice here from Leigh and James, among others.

every change-set would have to get reviewed before being pushed to public

Move the review gate: require review not when code hits the open source master branch (which doesn’t impact the organization’s security posture) but before it hits production (where it’s no longer an abstract project, but production code). Just as you separate modules within the software, separate the organization-specific concerns from the code itself. Abstraction is both a best practice and your friend here.
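
One way to picture that separation, as a minimal Python sketch: the public repository holds code that reads organization-specific values at run time, while the values themselves stay out of version control. The variable names `APP_DATABASE_HOST` and `APP_API_KEY` and the `DeploymentConfig` type are hypothetical, not something from this thread.

```python
import os
from dataclasses import dataclass

# Hypothetical setting names; each deployment supplies its own values.
REQUIRED_SETTINGS = ("APP_DATABASE_HOST", "APP_API_KEY")


@dataclass(frozen=True)
class DeploymentConfig:
    database_host: str
    api_key: str


def load_config(environ=None):
    """Build the config from environment variables, failing loudly if any are missing."""
    env = os.environ if environ is None else environ
    missing = [name for name in REQUIRED_SETTINGS if not env.get(name)]
    if missing:
        raise RuntimeError("missing required settings: " + ", ".join(missing))
    return DeploymentConfig(
        database_host=env["APP_DATABASE_HOST"],
        api_key=env["APP_API_KEY"],
    )
```

With this shape, nothing sensitive needs to appear in a commit, and a review before production can focus on the environment the code is deployed into.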

So why not use a mix of public/private? It lets your engineers develop the kind of workflows and coding practices that will get you better code even if it never ends up in a public repo; it lets you build workflows that integrate security checks/reviews into them. It lets you partner with the security team to collaborate on a set of workflows that they will (eventually) approve of. Open source isn’t an all-or-nothing proposition.

Whatever route you take, it’s important to realize that the technology’s the easy part. A move like this is simply leveraging technology as an excuse to motivate cultural change. It’s a different workflow and a different power dynamic than many primarily proprietary organizations are accustomed to. Starting small and privately, with something insignificant, just to go through the motions and find the friction points between collaborative development and your organization’s culture is hugely valuable in and of itself. Make a repo with the best places to get lunch nearby, or a style guide, or some other non-code thing to get a feel for how things work before starting the conversation to go public. Baby steps.

I was talking about the ability to use GitHub as the main development repository, independent of the specific workflow that’s established for a project. So, yes, devs should work in a branch, and code should be reviewed before merged to master, say. But unless we’re going to have a mix of private/public repos, then the branches and the pull requests should happen on the same public repository.

One word of caution: having an internal private repository (or other version control workflow) and an external public repository is not open source, and can be an especially bad experience if you have an internal bug tracker to which the open source community is not privy. How’d you like it if you dedicated 6 hours of your time to add a feature or submit a bug only to hear “oh yeah, we’ve been secretly working on that. We’ll push it directly to master next week”? You’d never contribute again. Likewise, you’d want more eyes on your code review, because when you open source your project, you expand the nexus of stakeholders beyond those within your organization. An imbalance of information between sides of the firewall is one of the best ways to ensure an open source project fails.

OP, more than glad to chat confidentially if you have questions about specifics / logistics of using GitHub. Feel free to reach out to gover...@github.com any time.

Cheers and open source,
- Ben

Charles Worthington

unread,

Mar 21, 2014, 11:30:27 AM3/21/14

to us-govern...@googlegroups.com, feicha...@gmail.com

Thanks for the great discussion in this thread.

I am interested in exploring the specific use case of developing a specific site or application in the open. This is somewhat different from traditional open source development, which is mostly focused on generic, reusable libraries that are not applications in themselves but components that can be used to create a specific application. Of course, there are some notable exceptions like the Firefox web browser (really all of Mozilla's stuff). There are also some "in between" things where the open source software is a fully functioning application, but can be easily modified and re-purposed because it addresses a very common use case and is designed to be modified and configured (e.g. WordPress, Discourse, and CKAN).

It is much less common to see a specific application with specific and unique business requirements developed in the open. Do people have thoughts on why this might be? I have a few ideas:

1. The utility of a specific application to the broader community is low because it is not as easily reusable as software specifically designed and packaged to be incorporated into other applications, or used for multiple use cases.
2. Because of the above, the likelihood of contributions to a specific application is low (in my experience most open source contributions come from people who are using the library in their own applications).
3. There may be productivity downsides to developing a specific application "in the open." Every commit checked into the repository on every development branch (even if just temporary / dummy code) is a potential security vulnerability. (Who among us hasn't hard-coded an API key or other piece of sensitive information in the very early stages of just getting the thing to work or evaluating a new software package?) Beyond security/PII issues, making every commit public potentially opens the development process up to political scrutiny in a way that I could easily see becoming burdensome/unproductive for certain projects (this would include not just the actual code but all the communication around the code that typically happens inside a repo). Techniques to mitigate this would seem to introduce friction that reduces the benefits of developing in the open (e.g. maintaining a "public" repo that gets more scrutiny separate from the "production" repo).
4. Unless the whole stack is open source, it may not be possible to make the development repo public anyway due to copyright issues.
5. In most cases, an actual application is proprietary while the reusable components are not (though this obviously doesn't apply to most government software).

In my mind, the benefits of developing an app in the open are:

1. Open up the possibility of receiving useful contributions from the public.
2. Increase accountability / quality from developers and designers working on the project (though this has downsides mentioned above).
3. Increase accountability of management to choose good projects (i.e. why are you building Software A? Agency XXX already built Software B to do this same thing; here is a link to the repo).
4. Though reusability of a specific application is low, it is definitely not zero because it's still valuable to see the code behind a real-world implementation of an application. And since many government applications address similar "business problems", the chance for specific application reuse may be higher in government than in other sectors.

I am curious whether people feel as ambivalent about this decision as I do. Do we definitely think developing applications in the open is a best practice? Should all government software be developed in the open by default? Or is it more nuanced than that?

Are there examples (government or otherwise) of specific applications developed in the open? The only one that comes to mind for me is Data.gov. There must be others?

Charles Worthington

unread,

Mar 21, 2014, 12:26:17 PM3/21/14

to us-govern...@googlegroups.com, feicha...@gmail.com

I posed this same question to the Government Github group, and have received some great responses. Worth checking out if you're interested: https://github.com/government/best-practices/issues/32

Noah Kunin

unread,

Mar 25, 2014, 9:47:41 PM3/25/14

to us-govern...@googlegroups.com, feicha...@gmail.com

I won't add much (since there's already so much win here) but if you're familiar with the new-ish NIST 800-53 rev 4 and want to talk about how "overlays" [not my term but whatever, let's roll with it] can help in this regard, please drop me a line or somehow make yourself known.

And if you really want to dedicate yourself, read up on the Alliant GWAC and its successor for context on where said overlays could [fingers crossed, will] plug in: www.gsa.gov/alliant

If all this is Greek to you, give me a weekend (or two) to translate it.

Anony Moose

unread,

Mar 29, 2014, 11:17:38 PM3/29/14

to Noah Kunin, <us-government-apis@googlegroups.com>

Thanks for all the great feedback. Sorry it has taken me so long to respond.

James wrote:

> There's a really good document about considering risks in source code management
> that was produced by the arm of our security services responsible for information
> assurance. Unfortunately it's currently restricted so I can't share it

That's ironic!

> The argument goes that if the design of a system is closed then an attacker will need
> to do extensive probing and testing before they're able to complete a successful attack

Yes, I think this argument is a good one. And I do believe that "security by obscurity"
does have merit, because it incrementally raises the bar to those who would launch an
attack, and since attacks are often opportunistic, it thereby lowers the risk. The issue
comes down to risk vs. reward. I want to make the case that in some instances
(for example, where we really are developing a library that is meant to be deployed and
used outside) the risk is negligible, and the potential rewards are high. The trouble
I've had is that some people seem to have the mindset that any risk whatsoever is
unacceptable (and security teams seem to attract this kind of person, wouldn't you agree?)

Ben wrote:

> One word of caution, is that having an internal private repository (or other
> version control workflow), and an external public repository, is not open
> source, and can be an especially bad experience if you have an internal bug
> tracker to which the open source community is not privy. How’d you like it
> if you dedicated 6 hours of your time to add a feature or submit a bug only
> to hear “oh yeah, we’ve been secretly working on that. We’ll push it directly
> to master next week”

Ben, this is a really good point, and I will use it.

Charles, thanks for starting the thread on "best practices". I see that there were
a lot of great responses there, too.

Noah wrote:

> I won't add much (since there's already so much win here) but if you're familiar
> with the new-ish NIST 800-53 rev 4 and want to talk about how "overlays" [not
> my term but whatever, let's roll with it] can help in this regard, please
> drop me a line or somehow make yourself known.... If all this is Greek to you,
> give me a weekend (or two) to translate it.

Not quite Greek -- maybe Spanish (and I don't speak Spanish). It is nice to know that
this mechanism of customizing security systems exists, and I will pass it along to our
security team, along with your contact info, if that is okay.

Thanks again!

Noah Kunin

unread,

Mar 30, 2014, 6:36:32 PM3/30/14

to Anony Moose, <us-government-apis@googlegroups.com>

Yep, passing along noah....@gsa.gov is fine!

I have extensive material and thoughts on the security/obscurity debate as well. If all adversaries were automatic scripts, security would be (mildly) increased by obscurity. But the most dangerous adversaries are all human minds, and as a result, security through obscurity results in less net security.

Federal systems should remain secure even if the source code is exposed or has been generated by something like IDA. You always assume the source code is out there. The way you keep your developers honest and ensure they don't rely on security by obscurity to remediate vulnerabilities is by open sourcing that same code. Not only does that force your team to a higher standard, but you gain the benefit of other perspectives evaluating your code for potential issues. Since there are always more collaborators than adversaries, the net security improvement is always greater. So not only are there significant incentives to open source, but to also actively support a community around your software.

If your security posture is materially improved by code obscurity, then it logically follows that your security would be materially harmed by the code being circulated. A "control" that can be defeated via copy + paste + email isn't worth setting up in the first place. :)

Applications that cannot remain resilient in production, even with known source code, should not be in production. Asking CISO teams to open source isn't asking them to change their game, but to raise it.