PBCore to _____ = ugh


john

Nov 13, 2012, 2:45:45 PM
to pbcor...@googlegroups.com
Has anyone else run into this issue?

We are often required to share our metadata with institutions that have either funded our digitization projects or partnered with us on them. These major institutions cannot or will not accept our PBCore XML records. They always seem to ask for a .csv or a spreadsheet of our data because it would be "easier to work with" in Excel. I try to explain the worksheet size limits of Excel, the PBCore asset/instantiation relationship, the multiple repeatable elements and attributes, the overall complexity of the schema, etc., but people don't really want to hear that (and I guess I don't blame them).

How does one handle these requests?  Is there an easy way to parse each element/attribute in a PBCore XML record (and by record I guess I mean instantiation) into a row in Excel?  I'm sort of loath to do this, but I understand it's what people want: a flat spreadsheet of our database. 

Maybe it's because I'm not a programmer or maybe it's how the back end of our database is structured (https://github.com/mlc/wnetpbcore), but I'm just not sure how to handle these requests.

Thanks! 

John Passmore
WNYC

Allison Smith

Nov 14, 2012, 9:58:48 AM
to pbcor...@googlegroups.com
Hi John -

What database is running on the back end? SQLite? MySQL?

You might try linking to your database with Access and see if you can pull in the necessary tables/fields...? I know Access is sometimes fussy with open source db systems, but I like using it for linking and for making smaller tables and spreadsheets that I can then easily pull into Excel.

Allison

Peter Karman

Nov 19, 2012, 2:54:47 PM
to pbcor...@googlegroups.com
An RDBMS is a 3-dimensional storage system. Data is normalized across
multiple tables.

A CSV (or Excel) doc is a 2-dimensional storage system. Data is stored
in rows and columns.

So no, you're not crazy. I get these kinds of requests all the time when
people want to use data in a tool they know well, even if that tool
can't represent the data in its original relationships. Square peg,
round hole.
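
To make the mismatch concrete, here is a rough Python sketch of the flat
export people usually ask for: one row per instantiation, with the asset-level
fields copied onto every row and repeatable elements squeezed into one cell
with a "|" separator. The sample record, the column names and the separator
are all made up for illustration, and the namespace URI is assumed to be the
PBCore 2.0 one. Treat it as a starting point, not a finished converter.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Assumption: PBCore 2.0 namespace; adjust if your records use another one.
NS = {"pb": "http://www.pbcore.org/PBCore/PBCoreNamespace.html"}

# Made-up sample record: one asset with two instantiations.
SAMPLE = """<pbcoreDescriptionDocument xmlns="http://www.pbcore.org/PBCore/PBCoreNamespace.html">
  <pbcoreIdentifier source="example">asset-001</pbcoreIdentifier>
  <pbcoreTitle>Example Program</pbcoreTitle>
  <pbcoreSubject>Radio</pbcoreSubject>
  <pbcoreSubject>New York</pbcoreSubject>
  <pbcoreInstantiation>
    <instantiationIdentifier source="example">inst-001</instantiationIdentifier>
    <instantiationLocation>Shelf A</instantiationLocation>
  </pbcoreInstantiation>
  <pbcoreInstantiation>
    <instantiationIdentifier source="example">inst-002</instantiationIdentifier>
    <instantiationLocation>server/path.wav</instantiationLocation>
  </pbcoreInstantiation>
</pbcoreDescriptionDocument>"""


def texts(node, path):
    """Text of every direct child of node that matches path."""
    return [(el.text or "").strip() for el in node.findall(path, NS)]


def flatten(xml_text):
    """Return one dict per instantiation, repeating the asset-level fields."""
    root = ET.fromstring(xml_text)
    asset = {
        "asset_id": "|".join(texts(root, "pb:pbcoreIdentifier")),
        "title": "|".join(texts(root, "pb:pbcoreTitle")),
        # Repeatable element: several values end up in a single cell.
        "subjects": "|".join(texts(root, "pb:pbcoreSubject")),
    }
    rows = []
    for inst in root.findall("pb:pbcoreInstantiation", NS):
        row = dict(asset)  # asset columns are duplicated on every row
        row["instantiation_id"] = "|".join(texts(inst, "pb:instantiationIdentifier"))
        row["location"] = "|".join(texts(inst, "pb:instantiationLocation"))
        rows.append(row)
    return rows


def to_csv(rows):
    """Serialize the rows as CSV text (expects at least one row)."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


rows = flatten(SAMPLE)
print(to_csv(rows))
```

The output opens fine in Excel, but the title and subjects are repeated on
every row and attributes such as source are dropped, which is exactly the
loss of structure described above.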

One approach is to create a .xls doc with each db table == one
worksheet, which preserves the 3-dimensional aspect of the RDBMS.
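
A minimal sketch of that idea in Python, writing one CSV per table instead of
one worksheet per table so it only needs the standard library. It assumes a
SQLite database purely for illustration (the actual back end in this thread is
not known), and the assets/instantiations tables are hypothetical stand-ins,
not the real schema.

```python
import csv
import io
import sqlite3


def dump_tables(conn):
    """Return {table_name: csv_text} for every user table in the database."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        )
    ]
    out = {}
    for name in names:
        cur = conn.execute(f'SELECT * FROM "{name}"')
        buf = io.StringIO()
        writer = csv.writer(buf)
        writer.writerow([col[0] for col in cur.description])  # header row
        writer.writerows(cur)  # data rows
        out[name] = buf.getvalue()
    return out


# Hypothetical two-table schema, in memory, just to exercise the function.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE assets (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE instantiations (
        id INTEGER PRIMARY KEY,
        asset_id INTEGER REFERENCES assets(id),
        location TEXT
    );
    INSERT INTO assets VALUES (1, 'Example Program');
    INSERT INTO instantiations VALUES (1, 1, 'Shelf A'), (2, 1, 'server/path.wav');
    """
)
sheets = dump_tables(conn)
for name, text in sheets.items():
    print(name)
    print(text)
```

Because the key columns (asset_id here) are kept, the recipient can still
rejoin the sheets with a lookup in Excel if they need a single flat view.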


--
Peter Karman . 651.228.4972
Director, Software Engineering
Minnesota Public Radio | American Public Media
