Job Posting: Lead Senior Research Data Management System Developer at Harvard

Eleni Castro

Dec 17, 2015, 1:02:44 PM
to Dataverse Users Community
Hi Dataverse Community,

There is an updated job opening for a Lead Senior Research Data Management System Developer at Harvard Medical School. The person hired will work closely with the Dataverse Team at Harvard, so please share this with anyone you think might be interested. The job announcement is below; feel free to get in touch if you have any questions.

Thanks for your consideration and have a happy holiday season!



Lead Senior Research Data Management System Developer
Harvard Medical School
USA - MA - Boston
Information Technology
Biological Chemistry and Molecular Pharmacology
00 - Non Union, Exempt or Temporary
A joint project between the SBGrid Consortium at Harvard Medical School and the Dataverse Team at the Institute for Quantitative Social Science at Harvard University has an immediate opening for a lead developer to help us build a next generation data publication system for large biomedical datasets. We aim to make biomedical datasets publicly available through a federated data grid to facilitate access, citation, and data analysis by scientists. Our pilot collection includes datasets generated using X-ray crystallography, computer modeling, lattice light sheet microscopy, and microED diffraction. This collection is currently replicated to computing centers in the US, Europe, Asia, and South America. The project is supported by the Helmsley Charitable Trust and was recently selected as a pilot of the U.S. National Data Service. To learn more about the environment, please visit our current implementation at and our group websites at,, and

The lead developer will be responsible for successfully migrating our in-house research data management system, written in Python, to Dataverse, after first extending Dataverse (with the full support of the Dataverse development team) to include the features necessary for the migration. The candidate will develop a final set of requirements based on the feedback and experience of the end-user community using our current pilot system. Examples of features that must be added to Dataverse include better support for large (~100 GB) datasets, automatic data validation pipelines, and other functionality relevant to specific biomedical data types. The lead developer will also help evaluate data transfer, upload, and management technologies, such as Globus, that can integrate with Dataverse to support larger datasets and provide direct computing on the data. The developer will work with our team to ensure that all new functionality developed under this project is merged into the Dataverse open source project and shared with the community.

As a senior member of our team, this individual will also support training junior members, collaborate with collection specialists, and present outcomes of the project at meetings and conferences.
A Bachelor's degree in computer science or engineering and 5-8 years of strong programming experience are essential, preferably in Java and Python, and ideally in the context of web applications.
Our team welcomes candidates with diverse technical backgrounds, but the successful candidate will have experience handling large datasets and leading software development projects. A working knowledge of Linux, shell scripting, databases, and distributed version control systems (Git, Mercurial, etc.) is also necessary. The ideal candidate will also be familiar with data management software and the handling and analysis of large datasets.
This is a term appointment ending on September 30, 2018.