Hi all,
I'm pleased to announce the release of a new CKAN extension focused on data quality. As part of the Frictionless Data [1] project, we at Open Knowledge International have been working on a new extension that integrates with goodtables [2], a powerful library for tabular data validation.
Here's a list of the main features:
* Perform automatic data validation in the background
* Validate files at upload time
* Dedicated report page listing issues found with the data
* Ability to provide a data schema for schema-based validation (e.g. the contents of this column should be a date, or a number between 1 and 100, etc.)
* Report generation with summaries for all issues found in all files
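
To give a rough idea of what schema-based validation means, here is a minimal sketch in plain Python. It is not the extension's or goodtables' actual code: the schema is written in the style of a Table Schema descriptor, and the column names (`measured_on`, `score`) and the `check_row` helper are made up for illustration.

```python
from datetime import date

# Hypothetical schema, written in the style of a Table Schema descriptor.
# The field names are invented for this example.
SCHEMA = {
    "fields": [
        {"name": "measured_on", "type": "date"},
        {
            "name": "score",
            "type": "integer",
            "constraints": {"minimum": 1, "maximum": 100},
        },
    ]
}


def check_row(row, schema=SCHEMA):
    """Return a list of issues found in one row (a dict of raw strings)."""
    issues = []
    for field in schema["fields"]:
        name = field["name"]
        raw = row.get(name, "")
        # First check that the raw value can be cast to the declared type.
        try:
            if field["type"] == "date":
                value = date.fromisoformat(raw)
            elif field["type"] == "integer":
                value = int(raw)
            else:
                value = raw
        except ValueError:
            issues.append(f"{name}: {raw!r} is not a valid {field['type']}")
            continue
        # Then check any range constraints declared for the field.
        limits = field.get("constraints", {})
        if "minimum" in limits and value < limits["minimum"]:
            issues.append(f"{name}: {value} is below {limits['minimum']}")
        if "maximum" in limits and value > limits["maximum"]:
            issues.append(f"{name}: {value} is above {limits['maximum']}")
    return issues


if __name__ == "__main__":
    print(check_row({"measured_on": "2017-11-01", "score": "42"}))
    print(check_row({"measured_on": "01/11/2017", "score": "250"}))
```

The first row passes both checks, while the second one is reported twice: once for the badly formatted date and once for the out-of-range score. The extension delegates this kind of checking to goodtables rather than doing it by hand.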
You can get a better overview in this blog post published on ckan.org:
Or if you want to dive right in, here's the extension repository, with an extensive README documenting installation, configuration and features:
Looking forward to any feedback and comments. We see lots of potential in this, so we are keen to see how it might be used in novel ways!
Adrià