Metacat 3.0.0 Major Release

Jing Tao

May 14, 2024, 5:34:32 PM
to ESS-DIVE, eml-dev, devel...@dataone.org, morph...@ecoinformatics.org, Rani Gaddam, yuuri...@db.soc.i.kyoto-u.ac.jp, secre...@jalter.org, Metacat Dev List

We are pleased to announce a new major release of Metacat (3.0.0), a turnkey data repository software platform used across the earth science community. In summary:


  • This release marks a major milestone towards support for the increasingly large datasets being collected by the scientific community. To enable this focus on horizontal scaling and large dataset support, several breaking changes from previous Metacat versions have been necessary (see “Breaking changes”, below).

  • Metacat can also be deployed on Kubernetes, using a Helm chart (note that this is a beta feature). See the Helm README.


New Features

Here is a selection of new features in this release. Consult the Release Notes for the full list:

  • Storage and indexing enhancements to handle large datasets, including faster indexing achieved by replacing Hazelcast with a new, independent dataone-indexer component

  • Kubernetes deployment, using a Helm chart, to allow multiple dataone-indexer instances to be deployed in parallel for horizontal scaling. (Note that this is a beta feature. It has been tested, and we believe it to be working well, but it has not yet been used in production, so we recommend caution with this early release. If you try it, we'd love to hear your feedback!)

  • Upgrade from Java 8 to 17

  • Metacat Administrator Interface now uses ORCID identifiers for login authentication and authorization. Admin users now also have Member Node admin privileges, allowing viewing and manipulation of private datasets

  • Metacat now verifies that essential components are available and configuration is correct, before startup. Messaging to assist with debugging of startup issues can be found in the metacat logs (e.g. 'tomcat/logs/catalina.out') and host logs (e.g. 'tomcat/logs/hostname(data).log').

  • Properties set through the Metacat admin configuration pages are now saved in 'metacat-site.properties', outside the Metacat installation directory, so settings will not be lost during future upgrades after 3.0.0.

  • Updated component compatibility: Metacat now works with Tomcat 9; Solr 8.8.2 to 9.5.0; PostgreSQL 14.x to 16.x; and the latest release of RabbitMQ
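
For anyone planning an upgrade, the compatibility ranges listed above can be turned into a quick pre-flight check. The following is only a rough sketch and is not part of Metacat: the function names and the SUPPORTED table are hypothetical, and the bounds are simply copied from the list above (RabbitMQ is left out because "latest release" is not a fixed version).

```python
from typing import Dict, Tuple

Version = Tuple[int, ...]

# Inclusive (lowest, highest) bounds, copied from the announcement.
# A short bound such as (14,) constrains only the major version.
SUPPORTED: Dict[str, Tuple[Version, Version]] = {
    "tomcat": ((9,), (9,)),
    "solr": ((8, 8, 2), (9, 5, 0)),
    "postgresql": ((14,), (16,)),
}


def parse_version(text: str) -> Version:
    """Turn a purely numeric dotted version such as '9.5.0' into (9, 5, 0)."""
    return tuple(int(part) for part in text.split("."))


def is_supported(component: str, version: str) -> bool:
    """Check a version against the listed range for one component."""
    low, high = SUPPORTED[component]
    v = parse_version(version)
    # Compare only as many components as each bound specifies.
    return v[: len(low)] >= low and v[: len(high)] <= high


if __name__ == "__main__":
    for name, ver in [("solr", "8.11.2"), ("postgresql", "13.9")]:
        status = "ok" if is_supported(name, ver) else "outside the listed range"
        print(f"{name} {ver}: {status}")
```

The sketch assumes purely numeric version strings; anything with a suffix (for example a release-candidate tag) would need extra parsing.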

Breaking changes:

  • The original Metacat API that we deprecated several years ago has now been fully removed and is no longer supported. Morpho and any other clients depending on this API will no longer work. Client access is available only via the DataONE API.

  • A valid admin (auth) token is required for a Metacat 3.0.0 installation to function correctly (i.e. to handle private datasets). This is only an interim requirement; a future release of Metacat will remove the need for this token.

  • LTER and OAI-PMH harvesters have been removed. Please see the Metacat administrator's guide for details.

  • For theming, skin-based deployments are no longer supported, and all sites should migrate to MetacatUI.

  • Metacat admin authentication is no longer supported via LDAP or password-based logins. An ORCID iD is now required in order to log in as a Metacat administrator.
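
For client authors moving off the removed Metacat API, here is a minimal sketch of what a DataONE API read request might look like. It only builds the request and does not send it. The base URL, identifier, and token in the usage note are placeholders, the helper name is hypothetical, and sending the token as a bearer token in the Authorization header is an assumption to verify against the DataONE documentation. The token here is an ordinary client token, which is separate from the admin token the installation itself requires.

```python
from typing import Optional
from urllib.parse import quote
from urllib.request import Request


def build_get_object_request(
    base_url: str, pid: str, token: Optional[str] = None
) -> Request:
    """Build (but do not send) a GET for one object from a Member Node.

    base_url is the node's DataONE service endpoint (placeholder value in
    the usage note); pid is the object's identifier.
    """
    # Identifiers often contain ':' and '/', so encode every reserved character.
    url = f"{base_url.rstrip('/')}/object/{quote(pid, safe='')}"
    headers = {}
    if token:
        # Assumption: the token is presented as a bearer token.
        headers["Authorization"] = f"Bearer {token}"
    return Request(url, headers=headers, method="GET")
```

A call such as build_get_object_request("https://example.org/metacat/d1/mn/v2", "doi:10.5063/F1ABC") returns a Request that could then be passed to urllib.request.urlopen; public objects would not need the token argument.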


About Metacat

Metacat (https://github.com/NCEAS/metacat) provides a standardized but customizable platform for preserving data and metadata in many formats. It helps scientists find, understand and effectively use data sets that they manage or that have been created by others. Hundreds of thousands of data sets are currently documented in a standardized way and stored in Metacat systems, enabling the scientific community to access a vast array of scientific data that can be easily searched, compared, merged, or adapted for other purposes due to its thorough and consistent description.


Metacat is compliant with the DataONE network, making it easy for organizations to participate in the global DataONE data federation, and integrates with ORCID and DOI identifier systems. It also supports the customizable MetacatUI client-side application for searching and browsing data in Metacat and DataONE, creating an easily deployed system with advanced search and discovery features pre-installed.


The open source Metacat system is maintained by NCEAS (https://nceas.ucsb.edu) and DataONE (https://dataone.org), and is used by repositories worldwide to manage data collections. We're an open community, collaborating on the shared development of this common data platform, to improve efficiency and sustainability in the open data ecosystem. We welcome all types of contributions, from feedback on features and bugs, to documentation, code, and everything in between. Please join us!


We hope that this software is useful to you. We welcome feedback and comments that will further improve the application. Please submit bugs and problems through our bug tracking system (https://github.com/NCEAS/metacat/issues) and send general feedback to 'metac...@ecoinformatics.org'.  


The Metacat Development Team


