A Query About Submitting a Pull Request for New Features

57 views

Skip to first unread message

Grant McNulty (500 Year Archive)

unread,

Sep 20, 2017, 2:27:36 AM9/20/17

to AtoM Users

Hello,

We have developed some new features that we think would be a worthwhile addition to AtoM's core. We have developed on v 2.3.1 but I understand that some of the plugins should be independently upgradeable as standalone components. The features include:

User registration: New users can register themselves, create short bios etc. Administrator receives notification of new user.
Comment and Upload: Once registered, users can comment on existing records. Comments are nest and include a a gravatar / bio for each user. Uploading is through an upload form, following which links to uploaded resources can immediately be posted in comments. All comments and uploads are held in a moderation queue before being published.
Enhanced Search Functionality: All search terms are highlighted in yellow in search. Highlighting of terms is passed through to the individual records (like archival descriptions and authority records) and to PDFs, where applicable. So, if you search for the term 'hat' it will be highlighted in search results, on the actual archival description / authority record etc. and anywhere that 'hat' might occur in a PDF (once you click through to it). If the original search term only appears in the PDF, it is still highlighted in yellow.

Some of these features have been developed as standalone plugins, while others (particularly the search feature) have required us to fork AtoM's core code.

I am just wondering about trying to incorporate these features into AtoM's core codebase. I understand that we can submit a pull request for Artefactual's coders to review our work, but who decides whether or not features should be included in AtoM's trunk codebase - is it the users of AtoM who might want these features or Artefactual as the lead developers? I understand that money and time are factors - there are costs for Artefactual to review and for our developers to make the necessary changes and both of these activities take time. How do we plan / budget for these?

Thank you,

Grant

Dan Gillean

unread,

Sep 20, 2017, 12:11:02 PM9/20/17

to ICA-AtoM Users

Hi Grant,

Thanks for sharing this! These features sound great. Thank you as well for the great questions. These type of queries come up fairly often, so I'm going to take some time in answering - apologies in advance for the length!

We've previously tried to answer some of these questions on the following wiki page, but based on some of our more recent experiences, this wiki page may require further revision. In any case, this is a good place to start:

https://wiki.accesstomemory.org/Development/Recommendations

As to the question of who decides what gets merged... from one perspective, you could say that it is us at Artefactual - we are the ones who have to maintain the new code through subsequent releases, ensure it works with other features, make sure any new strings can be translated, and who will have to write and maintain the documentation as well (though of course we welcome documentation pull requests as well and have basic wiki resources and slide deck on submitting small changes available), and answer forum questions about how it works and how it breaks. When users submit code to us, we have to invest a lot of unpaid time to review the code, suggest fixes and improvements, resolve issues, manage merge conflicts, testing, and more - and as a small company that gives away all of its core resources, we don't always have the capacity to take this on.

What it comes down to then, is community support for the features. If you - and/or a collection of users who are interested in seeing these features incorporated - are budgeting time and money to help sponsor code review, merging, and documentation, then the likelihood of seeing your code merged into a public release is increased dramatically. From this perspective, it is really our community who decides what gets prioritized enough to be included - just like any other feature. Ultimately, the process is not that different from the current model we outline for all development in our Development philosophy wiki page.

When evaluating a community-submitted feature for inclusion, here are some of the questions we ask:

Is there any related standard to this functionality, and if so, is the feature implemented in a standards-based way?
Is this a feature that many of our users have expressed interest in? Or just a small group?
Is the feature configurable - i.e. can users who do not want this functionality disable it, and/or configure it for their needs?
Has performance and scalability been considered in the implementation? Will this work equally as well for our large international users as it will for a small archives?
Is the code being submitted against the latest development branch - have the developers made sure they have resolved possible merge conflicts and were regularly rebasing against our development branch, or are we being expected to do that?
Does the code make use of existing methods, functions, and classes, or does it ignore existing patterns? Does it follow the recommendations in our coding standard?
Does the work support internationalization (i18n)? Are all strings wrapped in the i18n helper so they can be translated by our community?
Does the feature use existing project language where possible for a consistent user experience?
Are the licenses of any new libraries or dependencies compatible with AtoM's A-GPLv3 license?
Are any new libraries or dependencies well supported and well maintained?
Does this functionality introduce security risks? If so, how have these been addressed?
Does the code include comments that will help us (and future developers working with the code and maintaining the project) understand what is being done, why, and how?
Has any work been done to support accessibility (for example, the use of aria tags to support screenreaders, etc)?
Is the pull request broken up into logical, atomic commits so we can review each part independently and better understand the whole?
Are contributing developers responsive to feedback, available for questions, etc?
How much complexity will this add to the application, and how does that balance against the need/desire for the feature from our community? How much time will this add to our testing, our documentation process, our release packaging? Are there obvious challenges to long-term maintainability?

And so forth.

Ideally, these are questions that your team should be asking itself prior to submission - these are questions we try to consider in any development project we take on, learned through many humbling experiences over the years developing and maintaining the application. Not all questions must have perfect answers for it to be accepted, but we hope they might at least be considered. We're going to have to be able to answer them anyway if we take on the code - you are essentially asking us to take on all future work and be able to respond to the community when something doesn't work, so we want to be careful in how we approach code incorporation.

At this point, we are considering updating our policy so that any large pull requests must include budgeted resources for review, merging, testing, and documentation for us to be able to take on maintenance. We're simply unable to swallow these costs ourselves. Often in the past we have invested many hours into reviewing a feature, providing detailed feedback, and requesting changes, only to never hear from the submitter again. If you can demonstrate that the above questions have been addressed, this will dramatically reduce the amount of time that code review etc take us.

As to how to budget for that time, I admit this is trickier to answer. You can assume that any review process (which will include a developer, analyst, and possibly a sysadmin) will require a minimum of 6 hours of time - we have to inspect the code, install the branch, test the feature, look for regressions this might cause elsewhere, prep feedback, ask questions, etc. I'll consult our team for any other suggestions or rules of thumb I can pass on (perhaps re: number of commits, or overall size vs time, or how long it took to develop vs how long that might suggest it will take to review, etc) but one of our developers did say that size/complexity vs time is not a linear relationship - it's probably closer to an exponential one.

Note that these might be questions you can answer yourself somewhat in advance: do you have other team members perform code review when you develop features? How long did that take? How long did it take you to thoroughly test the feature? In terms of documentation, it again depends on the complexity of the feature, but a medium-sized feature is going to take me about 2 hours to document properly and deploy to the website.

In any case, the best way to proceed if you intend to share a feature and want us to take on its maintenance is to share very early. Send us a WIP draft pull request very early on - engage our team. If development is just starting then a) it take less time for us to look at what's there, and b) we can advise on the direction to ensure that some of the above questions are addressed and the likelihood of us accepting a PR later are increased. Because of this, it's much more likely we'll be able to comment as things develop without needing additional funding to do so. You can also consider making an early post in the user forum and see if other users are interested - maybe there is a simple change you can make that will gain support from another institution who would then be willing to help sponsor merging. Maybe there is a use case that neither you nor we will have considered, but will improve the feature for all.

In terms of the features you mention here - I'm sure there will be many users here interested in them! But there will also definitely be those who don't want them - for example, user registration is only useful if you expect public users to be logging in (for example, if they are also going to use the comment/upload functionality). The comment/upload functionality will definitely be of interest to many... but there will also be security, privacy, and scalability issues. I recall discussing this with you some time ago and sending you many use cases for consideration - were these somehow addressed in what you developed? My feeling is that probably all of these features should have at minimum on/off configuration settings, since our community uses the application in remarkably different ways and contexts.

If these conditions seem strict, or as if Artefactual has too much control - know that these are questions we are also considering as we look at the future of AtoM and explore new governance options. As outlined in the Development Philosophy wiki page, Artefactual was never intended to be the organizational home of the project - but setting up governance requires resources, and it requires the ability to be responsive to the community (having developers, project managers, and community analysts available, for example). The ICA was unable to commit to these things, which is how we ended up in the position we are in now. We remain very open to changing the organizational home of the application - ideally becoming just one development company among many contributing to the project! - so long as we are certain that the project will live on, and that any new governance model is capable and sustainable. We are exploring Foundation models for the future - but note that this could in many ways place more onus on community developers. For example, take a look at the Islandora code submission guidelines - they require that every feature submitted come with someone willing to take on at least the ongoing managerial maintenance of the feature, in addition to a number of other requirements. They also require that institutions pay a membership fee to have a say in what gets merged to the core.

Finally, one last thought on merging versus maintaining locally:

For features that not everyone in our community will use, it's not necessarily a bad thing, or a detriment to the community, to not have them in the public project. That is to say - you can still make the code available, write some documentation, and maintain it locally. We do maintain a wiki page of feature development we know about that hasn't yet been publicly merged, and we are always happy to add to this. If the feature is well designed and well-contained, and there's documentation for community users to know how to incorporate it, then it can be used - and this increases the likelihood of institutions sponsoring further development of it, or helping to sponsor its inclusion in the core public branch. To return to the Islandora example, they maintain a whole list of modules that are not merged in core - some because they are not stable, but others simply because not all users will want to use them or they don't entirely align with the core project focus/goals. A future version of AtoM, that is fully plugin/module based, could potentially follow a model like this more closely - it makes the core easier to maintain, shares some of the maintenance burden back with the community, and allows users more control over what they install.

If we do not end up accepting a pull request, or you are unable to help sponsor the review, merging and documentation of large feature submissions, we are always happy to list them on our wiki page and help promote them in any way we can. Hopefully that will help encourage interest, so you can collaborate with other users in either funding a review and merge, or maintaining them locally.

We're open to feedback on the above as well - so long as it addresses our very real limitations on resources. We want the project to be vibrant and comprised of many contributions - but we can't realistically do all the work required to maintain AtoM AND review/test/merge/document/support all pull requests without your help.

Regards,

Dan Gillean, MAS, MLIS
AtoM Program Manager
Artefactual Systems, Inc.
604-527-2056

@accesstomemory

--
You received this message because you are subscribed to the Google Groups "AtoM Users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ica-atom-users+unsubscribe@googlegroups.com.
To post to this group, send email to ica-ato...@googlegroups.com.
Visit this group at https://groups.google.com/group/ica-atom-users.
To view this discussion on the web visit https://groups.google.com/d/msgid/ica-atom-users/6362ca96-14fb-463e-9382-069ee364cf9e%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Reply all

Reply to author

Forward

0 new messages