Non IT person looking for answers to general CKAN questions

Rae Winborn

Sep 5, 2017, 2:33:47 PM
to CKAN Global User Group (Non-technical questions)
Hi,

My company is looking to publish significant parts of our data externally on our website for our members (but not publicly), so they can download their own ‘cuts’ of our dataset. 

We are looking to create an API on our website that would allow our members to filter and download the data themselves directly from our website. We would ideally need something that would be embedded into our website (which is already password protected) that would allow our customers to filter their desired data, have that data displayed both graphically and as the underlying data in a table, and then be able to download that cut of data as an Excel or CSV file. The data would need to be secure and private unless accessed by a customer, and the customer would NOT need additional credentials (username or password) in order to access the data.


I've come across CKAN and it looks like it could be appropriate, but I am finding the information on it a bit too technical for my understanding. I have a few questions:


  • Our current website uses WordPress. Can CKAN integrate with WordPress? Does it have to be installed on the same server as the WordPress site?
  • Given CKAN deals with open data, is it appropriate for publishing data privately (to verified users)? How is the data protected?
  • How advanced are the data visualizations? Can you create pivot tables? Graph aggregate data? (I've tried to play around with some of the graphic visualizations from some data.gov datasets, but it's not super intuitive and it doesn't display aggregate data. For instance, the higher education dataset (https://inventory.data.gov/dataset/032e19b4-5a90-41dc-83ff-6e4cd234f565/resource/38625c3d-5388-4c16-a30f-d105432553a4) provides a list of all U.S. universities. I couldn't figure out how to graph the number of institutions by state.)
  • As someone without an IT background is there someone that can assist in actually coding/deploying CKAN? 

Thanks in advance for your help!

Florian May

Sep 5, 2017, 10:44:24 PM
to ckan-global...@googlegroups.com
Hi Rae,

I'm a state government employee working with and on CKAN for my agency. We share many of your requirements and I've bumped my forehead many times on the problems you describe.

IMHO CKAN is definitely a viable option. Its API does all you have mentioned, and there are some good R and Python packages to facilitate access.
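To give a feel for what "filter and download" looks like from a script, here is a minimal sketch in Python. It assumes the data has been loaded into CKAN's DataStore and uses the `datastore_search` action of the CKAN API; the host name, resource ID and field name are placeholders made up for illustration, not real values.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def build_search_url(base_url, resource_id, filters=None, limit=100):
    """Build a datastore_search URL that returns only the matching rows."""
    params = {"resource_id": resource_id, "limit": limit}
    if filters:
        # CKAN expects the filters as a JSON object, e.g. {"state": "WA"}
        params["filters"] = json.dumps(filters)
    return base_url.rstrip("/") + "/api/3/action/datastore_search?" + urlencode(params)


def fetch_records(url):
    """Call the API and return the list of rows (each row is a dict)."""
    with urlopen(url) as response:
        payload = json.load(response)
    if not payload.get("success"):
        raise RuntimeError("CKAN reported an error: %r" % payload.get("error"))
    return payload["result"]["records"]


# Placeholder values: replace with your own CKAN address and resource ID.
url = build_search_url(
    "https://ckan.example.org",
    "00000000-0000-0000-0000-000000000000",
    filters={"state": "WA"},
    limit=5,
)
print(url)
# records = fetch_records(url)  # uncomment once the URL points at a real CKAN
```

The same URL can be pasted into a browser, which is a quick way to check what a given filter returns before writing any more code.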

re website integration
http://data.wa.gov.au/ is an example of a website with a simple redirect to CKAN. This keeps the website and CKAN separate and only requires a configuration change at your internal reverse proxy.
Blog post on CKAN and Drupal integration by David Read (CKAN core team): https://data.blog.gov.uk/2012/09/14/integrating-ckan-and-drupal/

re limiting read access
CKAN is designed to host open data, so out of the box all data is readable by everyone, with write permissions limited to maintainers.

You could host CKAN inside your firewall (that's what we currently do in my agency with our highly sensitive, non-public datasets). For this to happen you'll need your IT crew to implement access permissions at the firewall / reverse proxy level. This happens outside of CKAN itself.

Alternatively, you can limit access to authorised users only. There are many ways to do so, but unless they support "headless authentication" this will break access to the CKAN API (that's the machine-readable interface that scripts and other software, e.g. visualisations, use to talk to CKAN). Customising authentication can happen either outside of CKAN (again a proxy / firewall issue handled by IT) or through CKAN extensions like https://github.com/NaturalHistoryMuseum/ckanext-ldap (which only limits write access, not read access - your data would still be publicly visible). We tried that, but could not find a way to both limit access to CKAN and not break access to our CKAN API. However, this might be a limitation of how we implemented the user single sign-on (we don't support "headless authentication"), so your mileage may vary.
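For illustration, one form of "headless" access is a script sending a CKAN user's API key with each request instead of logging in through a web form. A minimal Python sketch follows; the host name and key are placeholders, and whether this works for you depends on how authentication is set up in front of your CKAN (it will not get a script past a separate single sign-on page at the proxy, for example).

```python
from urllib.request import Request


def build_authenticated_request(url, api_key):
    """Attach a CKAN API key to a request via the Authorization header."""
    return Request(url, headers={"Authorization": api_key})


# Placeholder values: each CKAN user can find their own key on their profile page.
request = build_authenticated_request(
    "https://ckan.example.org/api/3/action/package_list",
    "replace-with-a-real-api-key",
)
print(request.get_header("Authorization"))
# To actually send it: urllib.request.urlopen(request)
```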

re data viz
CKAN's basic visualisations are great for what they do but they are of course limited to the basic use cases. If you need shinier visualisations, your options are:
- have an analyst write a visualisation using live data from your CKAN, e.g. as an RShiny app (https://shiny.rstudio.com/). This can be done very easily (matter of hours) and is very flexible. We do this a lot. There are fantastic integrations with R and Python, and you can also use plain SQL or JavaScript - anything an analyst would need is there. Just point your data nerds at the CKAN API and off you go.

- (pay a developer to) develop a CKAN plugin providing the visualisations you need. This is a bit more effort (matter of days to weeks) and less flexible, but more integrated. E.g. there is a CKAN extension to use the absolutely gorgeous mapping software behind https://nationalmap.gov.au/ (TerriaJS) as a previewer for spatial data.

- use a third party data viz app (e.g. https://thenextweb.com/dd/2015/04/21/the-14-best-data-visualization-tools) - e.g. Tableau. This is where CKAN's API shines - anything you can access and do as a human by clicking buttons or browsing CKAN, software can do too through the CKAN API.
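As a small illustration of the first option, the "number of institutions by state" question from the original post comes down to a few lines once the rows have been fetched from the API. This is only a sketch: the sample rows and the `state` field name are invented, and the real column names depend on the dataset.

```python
import csv
import io
from collections import Counter


def count_by_field(records, field):
    """Count how many rows share each value of the given field."""
    return Counter(row[field] for row in records if row.get(field) is not None)


def counts_to_csv(counts, field):
    """Turn the counts into CSV text that opens directly in Excel."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow([field, "count"])
    for value, total in sorted(counts.items()):
        writer.writerow([value, total])
    return buffer.getvalue()


# Invented sample rows standing in for what the CKAN API would return.
sample_records = [
    {"name": "University A", "state": "WA"},
    {"name": "University B", "state": "CA"},
    {"name": "University C", "state": "WA"},
]
counts = count_by_field(sample_records, "state")
print(counts_to_csv(counts, "state"))
# state,count
# CA,1
# WA,2
```

The resulting table can be handed to any charting tool to draw the bar chart.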

Overall caveat: clean data in, good viz out. Make sure your data is in standard formats (CSV instead of XLS or worse, GeoJSON instead of shapefiles for spatial data, and so on) and passes the QA of http://goodtables.okfnlabs.org/. This will save your analysts lots of data cleaning time.

re commercial support
There are third parties offering any level of integration / customisation: https://ckan.org/commercial/
Of course, CKAN has a strong user and developer community (see https://stackoverflow.com/questions/tagged/ckan).


IMHO your crux will be to limit read access to CKAN to logged-in users. Experienced third party support providers (see link above) would be best suited to provide options.
I'd be keen to hear from others on this list about viable options for limiting CKAN read access to logged-in users only without breaking the API.

Hope that helps!
Florian



--
You received this message because you are subscribed to the Google Groups "CKAN Global User Group (Non-technical questions)" group.
To unsubscribe from this group and stop receiving emails from it, send an email to ckan-global-user-group+unsub...@googlegroups.com.
To post to this group, send email to ckan-global-user-group@googlegroups.com.
Visit this group at https://groups.google.com/group/ckan-global-user-group.
To view this discussion on the web, visit https://groups.google.com/d/msgid/ckan-global-user-group/ca658f02-c8ad-40d3-a24d-0e9c8cc60de9%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

David Read

Sep 6, 2017, 4:47:33 AM
to ckan-global...@googlegroups.com
Hi Rae,
It sounds like some of the commercial CKAN members can chip in here to
provide some answers. What region are you in?
David

David Read

Sep 6, 2017, 4:49:17 AM
to ckan-global...@googlegroups.com
Ah, I just saw Florian's very helpful response. Do ask if you have any more questions.
Regards,
David