HELP: Where can I find definitions for PBCore Controlled Vocabulary Terms?


Dianne Kennedy

Oct 3, 2012, 2:17:26 PM
to pbcor...@googlegroups.com
We are defining a metadata set to describe short video clips that are used to enhance publications presented on tablets and eReaders. We are evaluating the applicability of PBCore and its CVs. I can find the CV terms and URIs but cannot locate definitions for the CV terms. Can someone point me in the right direction?
Thanks

Dave MacCarn

Oct 3, 2012, 2:38:52 PM
to pbcor...@googlegroups.com
Most of what's available is at http://www.pbcore.org/documentation/
Looking at the older versions of PBCore on that page can also help, e.g. http://pbcore.org/v1/PBCore/UserGuide.html
The elements on those pages have examples; see http://pbcore.org/v1/PBCore/subject.html

PBCore was made to be very flexible, hence the use of the attribute:refer, so, say, pbcoreSubject could use an attribute that points to LC Subject Headings
http://authorities.loc.gov/cgi-bin/Pwebrecon.cgi?DB=local&PAGE=First
or Medical Subject Headings of the National Library of Medicine http://www.nlm.nih.gov/mesh/meshhome.html
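
To make the idea concrete, here is a minimal Python sketch (not taken from the PBCore documentation) of reading a subject term together with the authority it points to. It assumes the PBCore 2.0 attribute names `source` and `ref`; the term and the example.org URI are made-up placeholders rather than real authority records.

```python
import xml.etree.ElementTree as ET

# Hypothetical pbcoreSubject element. The attribute names "source" and "ref"
# are assumed from PBCore 2.0; the URI is a placeholder, not a real LCSH record.
SAMPLE = (
    '<pbcoreSubject source="Library of Congress Subject Headings" '
    'ref="http://example.org/authorities/subjects/placeholder">'
    "Public television"
    "</pbcoreSubject>"
)

element = ET.fromstring(SAMPLE)

# The element text is the subject term; the attributes say which external
# vocabulary it comes from and where that vocabulary's record lives.
term = element.text
authority_name = element.get("source")
authority_uri = element.get("ref")

print(f"{term!r} comes from {authority_name} ({authority_uri})")
```

The point is only that the definition of the term lives with the external authority, not inside PBCore itself.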

Cheers,
Dave

Dianne Kennedy

Oct 3, 2012, 3:14:24 PM
to pbcor...@googlegroups.com
Thanks Dave,
In particular I am looking for the definitions of terms in controlled vocabularies. For example, in the pbcoreAssetType vocabulary, what is the definition of a "clip" vs. a "media object" vs. an "excerpt"? What is an "item", and how is that different from a "shot"? Our goal is to create a subset of the CVs that apply to the short videos used to enhance publishers' tablet editions (most specifically for tablet magazines).

john passmore

Oct 3, 2012, 4:35:29 PM
to pbcor...@googlegroups.com
Hi Dianne,

As far as I know there are no specific PBCore definitions for these words. I think it can get confusing at times. For example, it looks like "boat" is still listed as a pbcoreGenre: http://metadataregistry.org/concept/list/vocabulary_id/148.html

This might be helpful for generations though:  http://americanarchiveinventory.org/project/glossary/generations/

John

Dianne Kennedy

Oct 3, 2012, 5:33:41 PM
to pbcor...@googlegroups.com

If PRISM is to build on PBCore, we will require definitions for these terms. When our specifications are used internationally, it is critical to have clear, precise definitions for all metadata fields and all controlled vocabulary terms. Perhaps we can craft definitions and submit those back to PBCore.org!

 

Dianne Kennedy
VP Emerging Technologies

IDEAlliance

dken...@idealliance.org
office: 630-941-8197
cell: 630-908-0770
fax: 630-941-8196

john passmore

Oct 4, 2012, 10:33:25 AM
to pbcor...@googlegroups.com
Again, anyone on the list, please correct me if I'm wrong! But...

From what I understand, there is no longer a committee or board that maintains or creates guidelines, definitions, updated documentation, etc, for PBCore.  

PREMIS, for example, has an editorial committee, whereas PBCore issues just kind of get hashed out on this listserv.    

I suppose we could, as a group, collectively craft definitions, clean up the controlled vocabularies, and better define the use of certain elements. And then add that info to the metadata registry!

J       

Dianne Kennedy

Oct 4, 2012, 10:37:25 AM
to pbcor...@googlegroups.com

Thanks so much for your input.  Our group is currently reviewing the elements and CVs for application in our limited domain and will gladly share our work back with this group.  Anyone on this list that is interested in participating can contact me to get involved in the work.

Jack Brighton

Oct 4, 2012, 12:13:41 PM
to pbcor...@googlegroups.com
John, you are of course correct in saying there is no longer a PBCore committee or board. With the exception of the wonderful people associated with the American Archive Project, my impression is there's no one remaining at the Corporation for Public Broadcasting who even knows what PBCore is about, let alone that it needs further work on the controlled vocabularies and definitions, among other things.

In that vacuum, some of us could step in, and I'd be glad to help in whatever way I can. I do think PBCore is a potentially useful metadata standard, but it's not there yet. Lots of people are using PBCore in their own A/V collections and applications, but this results in many different forks of the schema because people need to define specifics that aren't yet in the schema.

I think we could do a few useful things:
  • Clean up the controlled vocabularies ("boat" is a genre, really?)
  • Provide actual definitions of the CV terms, and provide an online authority resource (Metadata Registry is lacking in important ways, or at least it's underdeveloped)
  • Clarify the data types in the XSD. Too many values can be just strings, which is fine for humans maybe but doesn't work for machines. Date and time values can be expressed in different formats/ISO standards, which can also be problematic for data exchange between systems. 
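
As a rough illustration of the date problem in the last point, here is a small Python sketch. It is an assumption about what a legacy catalog might contain, not part of any PBCore tool: free-text dates in a few known formats are normalized to ISO 8601 (YYYY-MM-DD), and anything unrecognized is flagged for human review instead of being guessed.

```python
from datetime import datetime
from typing import Optional

# Hypothetical list of formats a legacy catalog might contain.
# Slash dates are assumed to be US-style month/day/year.
KNOWN_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%B %d, %Y")


def to_iso_date(raw: str) -> Optional[str]:
    """Return an ISO 8601 date string, or None if no known format matches."""
    text = raw.strip()
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue  # try the next candidate format
    return None  # leave unparseable values for a human to resolve


for value in ["2012-10-03", "10/3/2012", "October 3, 2012", "sometime in the fall"]:
    print(f"{value!r} -> {to_iso_date(value)}")
```

A typed date field in the XSD would let a validator reject the last value up front, instead of every receiving system needing its own version of this guesswork.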
I know that the core developers at WGBH are aware of these things, but there's no further CPB funding to take it further. I was on the PBCore 2.0 schema review team, and we were painfully aware the work wasn't yet done. And I set up http://pbcoreresources.org to (hopefully) facilitate collaboration in the PBCore community on this stuff and whatever else. (With apologies, I have not been as active in the role of facilitator as I should have been lately!)

Sorry for the long-windedness here. I do believe cleaning up PBCore would be worth the effort, and would be glad to contribute in any useful way.

Best regards,

Jack Brighton

Kim Peach

Oct 4, 2012, 5:21:49 PM
to pbcor...@googlegroups.com
...and perhaps now is the time to seriously consider renewing efforts since 

1. the CPB has announced the next phase of the American Archive Inventory project (digitization), and more than a hundred participating public broadcasting stations, many of whom will be creating the first catalog of their collections, will be looking for a PBCore controlled vocabulary resource.

and

2. looking at the names on the new CPB AAI Advisory Panel leads me to believe resources might be made available for this if proposed. See list of members below.

I've heard that CPB has taken "ownership"/responsibility for the PBCore standard now; however, they will not continue to be involved with the American Archive Project. I suggest Jack's PBCore Resources website, http://www.pbcoreresources.org, indeed be used as at least an informal "gathering place" to post the definitions we come up with in our own institutions and let the community comment on and edit them as we see fit (à la Wikipedia). References to vocabularies used by AES, IASA, Sound Directions, etc. should be included as applicable. For a start anyway...

Thanks,
Kimberly Peach
Former, American Archive Inventory Project Archivist, WXPN Public Radio

The Corporation for Public Broadcasting Convenes Advisory Panel to Guide Development of the American Archive
http://www.cpb.org/pressroom/release.php?prn=985 

American Archive national advisory panel members include: Henry Becton, vice chair and former president for the board of trustees of the WGBH Educational Foundation; Ken Burns, filmmaker, The Civil War, Baseball, Prohibition and the upcoming The Dust Bowl; John W. Carlin, former Governor of Kansas and archivist of the United States, and currently visiting professor, executive-in-residence in the School of Leadership Studies at Kansas State University; Dr. Jeffrey Cole, director of the Center for the Digital Future at University of Southern California’s Annenberg School for Communication and Journalism; Deanna Marcum, managing director at Ithaka S+R, a not-for-profit research and consulting organization, and former associate librarian of Congress; John Ptak, film producer and former talent agent at CAA, William Morris and ICM, and member of the National Film Preservation Board and the National Film Preservation Foundation; Bruce Ramer, partner at Gang, Tyre, Ramer & Brown, a Los Angeles entertainment and media law firm, and member of the CPB board of directors; Cokie Roberts, commentator for ABC News and contributor to NPR’s Morning Edition; Dr. Stephen D. Smith, executive director of the University of Southern California Shoah Foundation – The Institute for Visual History and Education; Hon. Margaret Spellings, president and CEO of Margaret Spellings and Company, and former U.S. Secretary of Education from 2005 to 2009; Sir Howard Stringer, chairman of the board of directors, Sony Corporation; and Jesús Salvador Treviño, writer, director and producer whose public television credits include America Tropical, Yo Soy Chicano, La Raza Unida, Chicano Moratorium, The Salazar Inquest, Birthwrite and the PBS documentary series, CHICANO! History of the Mexican American Civil Rights Movement.

Marcos Sueiro Bal

Oct 5, 2012, 11:38:39 AM
to pbcor...@googlegroups.com
I hope Kim is correct with regard to PBCore securing more development resources. The cleaning and maintenance of PBCore's large and unwieldy set of CVs alone would be a big task indeed. And let us not forget that, as far as I know, PBCore 2.0 is not yet fully implemented as a schema anywhere on the planet.

Another question is how a newly-formed board would implement or even enforce any changes. For example, I wonder how many of us would change our vocabularies (never mind retroactively) if the board decided we no longer need descriptionType "clip" since there is already "segment".

Another, albeit more cumbersome, approach is to create a set of "best practices" in "rogue" sites such as Jack's very useful pbcoreresources.org, and hope most users will adhere. At this point there is no real connection between the official site and Jack's.

In my view the American Archive Project, with its gathering of metadata from such a diverse range of sources, would have been the perfect catalyst to create a solid schema that fits most of us. If it has not happened yet, the future of PBCore seems dubious at best.

Cheers,

Marcos
--
Marcos Sueiro Bal
Audio Engineer
718.902.7441

Jack Brighton

Oct 5, 2012, 4:21:15 PM
to pbcor...@googlegroups.com
I hope that Kim's rumor about CPB taking ownership of PBCore is true, and I think I'll see what I can find out there.

Meanwhile, there's a ton of other PBCore development taking place in other realms. I don't know who all is on this pbcore-talk list, if it's exclusively public media folks, etc. So I'll mention a few things outside of public media that might add to the picture of what's happening.
  • From the "Knight Foundation News Challenge: Data" awards announcement:

    "Pop Up Archive: Taking multimedia content – including audio, pictures and more – from the shelf to the Web, so that it can be searchable, reusable and shareable. Founded by University of California grad students and SoundCloud Fellows, the project beta-tested by helping archive the collection of the independent, Peabody-winning production team the Kitchen Sisters."

    This project uses open source, well-adopted components including Omeka and WordPress, and the Internet Archive as the primary audio file/metadata repository. They've already built a PBCore metadata plugin for Omeka. As always, until I see the system details, output, etc., I wonder about the implementation, but the project documentation so far looks promising.

    Links:
    1. Knight News Challenge: Data award announcements: http://www.knightfoundation.org/press-room/press-release/six-ventures-bring-data-public-winners-knight-news/
    2. Pop Up Radio Archive description on the UC Berkeley iSchool site: http://www.ischool.berkeley.edu/programs/masters/projects/2012/popuparchive?utm_source=Online+News+Association+List&utm_campaign=1a8013df41-News_Challenge_Data_winners_at_ONA129_20_2012&utm_medium=email
    3. Pop Up Archive site: http://popuparchive.org/

  • Northeast Historic Film, the main repository of historical, documentary, news, and other archival moving image collections covering the New England region, migrated their collections data from an ancient ProCite database to CollectiveAccess using PBCore as a metadata standard. In a project funded by the Council on Library and Information Resources, NHF got funding to develop a PBCore import module for CollectiveAccess, an open source collections management system. The metadata was then exported from CollectiveAccess to a new Drupal website providing front-end access to the NHF collections. See: http://www.oldfilm.org/
  • The Open Media Foundation has developed a Drupal-based platform allowing Public Access TV stations to publish, schedule, and share video content. PBCore metadata is at the core of its controlled vocabularies, and they wrote a Drupal PBCore module to integrate with the system. See: http://openmediafoundation.org/, and http://drupal.org/project/pbcore 
  • Other film archives and institutions adopting or considering adopting PBCore: Chicago Film Archive, Pacific Film Archive, Washington University, the University of Illinois, and many other members of the Association of Moving Image Archivists (AMIA) where PBCore has been the topic of presentations and workshops since 2006. 
What I'm saying is PBCore has been out of the CPB box for some time. I've been involved in some of the advocacy and training sessions, for example at the AMIA Conference and the Open Video Conference. What I find frustrating is that CPB seems unaware of the adoption rate and impact of PBCore outside the public broadcasting system. The thing is, with impact comes responsibility, and it has long seemed to me that CPB wants nothing to do with taking that responsibility.

I would love to be wrong about that!

Sorry if this is a rant, but hopefully there's some useful content here. :)

Jack

Marcos Sueiro

Oct 5, 2012, 8:26:42 PM
to pbcor...@googlegroups.com
This is very helpful indeed. Thanks, Jack.
 
Marcos

Allison Smith

Oct 7, 2012, 11:18:17 AM
to pbcor...@googlegroups.com
http://www.cpb.org/pressroom/release.php?prn=987

Hello -

I am also interested in seeing PBCore strengthened. My experience with developing a PBCore 2.0 compliant system (still in development) is that element definitions, cataloguing examples, and rules for use really could use some tender loving attention. I also found several elements without attributes that could use them, as well as elements that could stand to be repeatable but currently are not.

I'm attaching a press release that came out recently about the American Archive project. There are a few sentences that mention that CPB is working on developing a new management system for the AACIP data, with AVPS as consultants on the project. I contacted AACIP staff and spoke with Caitlin Hammer about it. While they haven't yet chosen a system, they do intend to upgrade the existing data records to PBCore 2.0, with more of a parent (Intellectual content) / child (Instantiations) structure.

Now might be the opportune time to lobby them to re-open community development of PBCore, as they develop the system that will house the AACIP records, and which might become the management system available to public stations needing one.

Allison Smith,
Archivist, WI Public Radio

Bruce Jacobs

Oct 7, 2012, 12:26:29 PM
to <pbcore-talk@googlegroups.com>, pbcor...@googlegroups.com
Great posts, here.

I'd be interested in thoughts as to how practical it is to improve the vocabularies. Seems like we'd all like to see better value lists and more consistent use. But in the real world, is this possible, given the disparate systems, organizations and personnel involved?

We have a dickens of a time with this within our plant just for tape. And with digital files, it's worse, because it takes a high level of skill to analyze a file and fully describe the format, with accuracy, much less use the right terminology to describe it.

I ask this figuring that understanding the scope of the challenge is necessary to make headway. Wishing CPB would "fix it" isn't enough.

Bruce


Allison Smith

Oct 8, 2012, 11:14:16 AM
to pbcor...@googlegroups.com
Hi Bruce -

I understand your reservations.

I think that it is never easy coming to consensus. When it comes to vocabularies, the lumpers (of which I am one) will always upset the splitters (and vice versa), and some people will always walk away upset with things. However, for the good of the whole, it is important to hash these things out, and try to tighten up vocabulary lists and usage rules so that people can move forward with some direction and sense of clarity.

You are right, though, about the challenges when it comes to determining vocabularies for digital resources (digital format, standard, encoding) - ugh. There doesn't appear to be any consistency in how the terms are used, or any logical way to organize them into a hierarchy. Also, what happens as the technology changes over time (as it does, and frequently), and new terms need to be added...? This is a huge challenge - much more so than the other element vocabularies, I think.

But, perhaps in the course of hashing out the issues, it will become clear what can be done.  I am (mostly) optimistic.

Allison

Jack Brighton

Oct 8, 2012, 12:56:58 PM
to pbcor...@googlegroups.com
All good points. I think any use of human language will never be perfect, but we can at least navigate toward common ground. 

We have as a given that many existing systems use different vocabularies, as Bruce points out. The last thing I would do is throw away the proprietary data from those systems, since it reflects the most accurate local knowledge of the content. But if we want interoperability at the programmatic and semantic level, then we need a common language and that's what PBCore has sought to provide. Alas, it's not fully cooked but the potential remains sound.

So there's work to do there to tune the PBCore vocabularies, but that's just a starting point. What do we do with the existing data based on other vocabularies?  I have a couple of too-vague ideas about this, and for the sake of vagueness I'm going to stick to Descriptive metadata and ignore the problem of Technical terms. Some hand-waving here, but please stay with me!

PBCore provides a simple way of specifying not just a subject term, but also the subject authority used. We have lots of different subject authorities for our content, for example the NPR API subject list, PBS/Bento/COVE subjects, the older PBS Program Taxonomy, the American Archive (which used a very specific subject list in its Pilot Project), and of course our legacy local vocabularies. In PBCore, for any given content, we can include pbcoreSubject elements using terms from all these vocabularies, along with source attributes specifying which vocabulary goes with each term.
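
As a rough illustration of the paragraph above, here is a minimal Python sketch that emits several pbcoreSubject elements for one asset, each carrying a source attribute naming its vocabulary. The subject terms are made up, the source labels are just free-text names taken from this message, and the output is a bare fragment (no namespace or other required elements), so treat it as a sketch rather than valid PBCoreXML.

```python
import xml.etree.ElementTree as ET

# Hypothetical subject terms for one asset. The vocabulary names come from
# the message above; the terms themselves are invented for illustration.
SUBJECTS = [
    ("Environment", "NPR API subject list"),
    ("Science & Nature", "PBS Program Taxonomy"),
    ("Global warming", "Local station vocabulary"),
]

def build_subjects(subjects):
    """Return a pbcoreDescriptionDocument element with one pbcoreSubject per term."""
    doc = ET.Element("pbcoreDescriptionDocument")
    for term, source in subjects:
        # The source attribute records which vocabulary the term belongs to.
        element = ET.SubElement(doc, "pbcoreSubject", {"source": source})
        element.text = term
    return doc

doc = build_subjects(SUBJECTS)
print(ET.tostring(doc, encoding="unicode"))
```

Each term stays tied to the vocabulary it came from, so a consumer that only understands one of the subject lists can filter on the source attribute and ignore the rest.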

That's a lot of work. But nobody said cataloging is easy, and if we want our content to be meaningful and findable in different taxonomy spaces, we kind of need to do this one way or another. 

But there might be a better way. One of the great things about PBCore 2.0 is the ref attribute, which allows us to specify not only the term in a given vocabulary, but a URI for that term. One of the great things about URIs is they can link to a complete definition. Even better than that, you can map two URIs together and declare them related. 

So if in your vocabulary you have the term "Climate Change," and in my vocabulary it's "Global Warming," we can map these different terms via their URIs to "mean" the same thing. We could resolve them both to another vocabulary, like LCSH, Merlin, or Wikipedia. 

Of course said mapping would have to be done by hand, but it would only have to be done once for each vocabulary. This also assumes that URIs exist for each term in each vocabulary. We'd have to write mapping tools and plugins for a number of different systems and tools used by local collections. Or a cloud service that accepts mapping definitions from our different systems. Quite a bit of work to be done there! 
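
To make the mapping idea a little more concrete, here is a minimal Python sketch, assuming each vocabulary already has a URI per term. Every URI below is an invented placeholder (not a real registry, LCSH, or Wikipedia identifier), and the hand-built lookup table stands in for whatever mapping tool or cloud service would actually hold these links.

```python
# Hand-curated mapping from local term URIs to a shared concept URI.
# All URIs are hypothetical placeholders for illustration only.
SAME_AS = {
    "http://example.org/vocab-a/climate-change": "http://example.org/shared/climate",
    "http://example.org/vocab-b/global-warming": "http://example.org/shared/climate",
}

def resolve(uri):
    """Return the shared concept URI for a term, or the URI itself if unmapped."""
    return SAME_AS.get(uri, uri)

def same_concept(uri_a, uri_b):
    """True when two term URIs resolve to the same shared concept."""
    return resolve(uri_a) == resolve(uri_b)

print(same_concept(
    "http://example.org/vocab-a/climate-change",
    "http://example.org/vocab-b/global-warming",
))
```

The point of resolving both sides to a third vocabulary is that each collection only has to map its own terms once, rather than maintaining a pairwise mapping to every other collection.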

But the payoff is our metadata can live and breathe programmatically and semantically with other systems and collections, without all of us having to use the same vocabularies. 

I think the low-hanging fruit remains a consolidation of the PBCore terms, especially for elements like pbcoreGenre, pbcoreAssetType, creatorRole, contributorRole, etc. And the technical terms in pbcoreInstantiation need work (although that stuff gives me a headache…).  

And maybe Open Metadata Registry is fine for the term URIs, but currently there are no definitions there for the terms. I'd suggest that writing definitions would force us to consider the meaning of each term, and whether it needs to be included or perhaps consolidated with other terms. 

For example, "Boat."


Cheers,
Jack

Anne Wootton

Oct 10, 2012, 10:06:00 PM
to pbcor...@googlegroups.com, bailey....@gmail.com, Bailey Smith

We enjoyed reading through this chain, since many of you have raised issues that we've been struggling with as we develop the Pop Up Archive system for oral history archives and independent producers. Thanks to a grant from the Knight Foundation, we are now working to expand our system with the ultimate goal of systematizing records across public media organizations (we're particularly glad to see some more money going into the CPB's American Archive project). We are currently partnering with the Public Radio Exchange, and we’ll be reaching out to the CPB about our work, since we’d love to contribute meaningfully to this standard.

We've been building out our PBCore plugin for Omeka, and are working to perfect the mapping from our simplified PBCore Omeka Element Set to valid PBCoreXML. We reached out early in our project to Dave Rice at the suggestion of Kris Carpenter at the Internet Archive, and he's been very helpful and forthcoming with feedback. We've been really lucky to have Dave's help, but it's clear that there needs to be some authoritative place to ask questions and troubleshoot problems as they arise. We've already cycled through several iterations of our schema, and we certainly want to be mindful in avoiding the "another fork" scenario. We're hoping that our Omeka to PBCoreXML mapping will provide standardization among those using the plugin. We were informed by WNYC, WGBH, and the California Audiovisual Preservation Project as we built the plug-in, and it would certainly make sense for us to try to synthesize our efforts.

We've also struggled where we've found PBCore lacking in terms of certain guidelines, definitions, and updated documentation. Since we're spending a lot of time thinking about and working with public media and oral history metadata these days, we'd love to be involved with or take a more significant role in terms of helping to manage the great work and thinking being done by everyone on this list.

Where we’d love to help:

  • Collaborate with anyone interested in the creation of definitions for the terms in PBCore's controlled vocabularies, add the definitions to the Metadata Registry, and look into solutions for the ever-present vocabulary problem when it comes to legacy records. A few thoughts (we’re not 100% solid on what this would require from a programming standpoint, but where there’s a will, there’s a way):
    • Leveraging linked data resources could help resolve issues with legacy vocabularies by mapping between concepts like “Climate Change” and “Global Warming” (http://tomheath.com/papers/bizer-heath-berners-lee-ijswis-linked-data.pdf).
    • Jack pointed out that the Metadata Registry is limited. While it’s true that a one-time mapping by hand would be a lot of work, it could be done from the existing Metadata Registry-maintained PBCore vocabularies to a linked data vocabulary. 
    • That way, legacy vocabularies would not need URIs for each of their terms. Instead, the linked data mapping would use the Metadata Registry URIs and link to other terms through a linked data vocabulary, à la DBPedia (http://dbpedia.org/About). 
  • Clarify the data types in the XSD. Too many values can be just strings, which is fine for humans maybe but doesn't work for machines. Date and time values can be expressed in different formats/ISO standards, which can also be problematic for data exchange between systems.
  • Integrate resources. Perhaps we could add an editable wiki to the PBCore site (pbcore.org) that incorporates content from the PBCore GitHub issues page (https://github.com/WGBH/PBCore2.0/issues, http://pbcoreresources.org), the archives from this Google Group, and Jack’s blog (http://www.pbcoreresources.org). As some of you mentioned, element definitions could definitely use examples (à la W3C). We could all contribute snippets of our schemas and post them to a wiki, and Pop Up Archive would love to attempt to standardize across the snippets.
  • Regarding the GitHub page, it might be helpful to do something like dot voting, where the issues that get the most dots are those that really need to be ironed out. We’ve noticed that on other GitHub issues pages certain problems will get "+1" posts, so dot voting could be an improvement on that.
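
On the data-type point in the list above, here is a small Python sketch of what normalizing free-form date strings to a single ISO 8601 form might look like before exchange between systems. The list of accepted input formats is an assumption for illustration (a real collection would need its own list), and slash-separated dates are assumed to be month/day/year.

```python
from datetime import datetime

# Input formats we assume legacy records might use. This list is
# illustrative, not drawn from the PBCore XSD.
KNOWN_FORMATS = ["%Y-%m-%d", "%m/%d/%Y", "%B %d, %Y", "%d %b %Y"]

def to_iso_date(value):
    """Return the date as YYYY-MM-DD, or None if no known format matches."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(value.strip(), fmt).date().isoformat()
        except ValueError:
            continue  # try the next candidate format
    return None

for raw in ["10/11/2012", "October 3, 2012", "sometime in 1987"]:
    print(raw, "->", to_iso_date(raw))
```

Returning None for unparseable values, instead of guessing, leaves those records for a human to review, which seems safer for archival metadata.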

Marcos raises good points about how changes to PBCore would be implemented and enforced (e.g., what if descriptionType “clip” gets nixed in favor of “segment”?). In the interest of having a working schema, we’d be happy to help when it comes to implementation and enforcement, even if it isn’t a pretty process. It seems like having a centralized wiki for pooling the existing disparate PBCore web resources could be a step in this direction.
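
One way to picture the implementation side of such a change is a published table of renamed terms that each system applies to its own records. The Python sketch below assumes the hypothetical "clip" to "segment" rename mentioned above and a simple list-of-dicts record format; neither reflects an actual PBCore decision or tool.

```python
# Hypothetical rename table; "clip" -> "segment" is only the example
# from this thread, not an actual change to the vocabulary.
RENAMED_TERMS = {"clip": "segment"}

def migrate(records, field, renamed):
    """Return (updated copies of records, number of records changed)."""
    updated = []
    changed = 0
    for record in records:
        new_record = dict(record)  # leave the caller's data untouched
        old_value = record.get(field)
        if old_value in renamed:
            new_record[field] = renamed[old_value]
            changed += 1
        updated.append(new_record)
    return updated, changed

records = [
    {"id": "a1", "descriptionType": "clip"},
    {"id": "a2", "descriptionType": "abstract"},
]
migrated, count = migrate(records, "descriptionType", RENAMED_TERMS)
print(count, migrated)
```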

Glad to see people talking and thinking about this,

Anne & Bailey

popuparchive.org

Allison Smith

Oct 11, 2012, 10:36:51 AM
to pbcor...@googlegroups.com
Hi -  Count me in!  When do we start?  How can I help?

Allison

Jack Brighton

Oct 11, 2012, 11:47:12 AM
to pbcor...@googlegroups.com
Me three! Let's get this train rolling.

Jack

Kim Peach

Oct 11, 2012, 12:46:14 PM
to pbcor...@googlegroups.com
Hi all,

In the interest of clarity, I went ahead and dropped a line to Stephanie Sapienza, American Archive Inventory Project Manager at the CPB, to verify whether I was correct in stating in an earlier post here that the CPB had taken over ownership of the PBCore website as well as the Standard and Vocabulary itself.  She confirmed that this is correct and relayed that this "ownership" of PBCore will transfer along with the American Archive when a new home is found for it in the Spring of 2013.

Stephanie also expressed her support of PBCore and its future development.

So, my take is that any work this group does between now and next year will be a great help in keeping the standard moving in the right direction!

Kim

Thom

Oct 11, 2012, 12:56:27 PM
to pbcor...@googlegroups.com
I'd also point out or remind everyone that many of the controlled
vocabularies which might seem redundant or too vague for broadcasters
are very helpful for libraries, archives, and museums that house
collections containing public media objects, even though such objects
are not their primary focus. Both LCSH and LCGFT are controlled
vocabularies with an active in-house editorial board which adds or
modifies radio and television form/genre terms, as well as the use of
these terms as topical terms, for those institutions who need them.
Also, many of the entities and their relationships in the
library/archive world's new cataloging code RDA (Resource Description
and Access) have registered URIs on the Open Metadata Registry. Genre
and form are attributes of works in this code, although subject
concepts (such as concept, topic, place, temporal period) are not yet
written into the code.

Best,
Thom

Thomas Pease
(not speaking for my employer)
Library of Congress
Packard Campus for Audiovisual Conservation
Recorded Sound Cataloger
tpe...@loc.gov

Yvonne Ng

Oct 11, 2012, 1:06:33 PM
to pbcor...@googlegroups.com
Hi all

I don't know if this is helpful at all, but there is a Google spreadsheet that some folks on this list contributed to way back when WGBH was seeking feedback on vocabularies for PBCore 2.0. The spreadsheets list all of the vocabulary terms, their source, and some comments from various reviewers. As I remember, there wasn't enough project funding to move beyond this step.

Here is the link again for those who are already shared on the document.

I don't feel at liberty to open the Doc to everyone since there are contributions from various people in it. But perhaps we can open it up if people think it's even useful and those who contributed have no objection.

Best,
Yvonne


--
Yvonne Ng
Archivist
yvo...@witness.org

WITNESS
80 Hanson Place, 5th Floor
Brooklyn, NY 11217
Tel. 718-783-2000 x320  ::  Fax. 718-783-1593
http://www.witness.org  ::  http://blog.witness.org


Bruce Jacobs

Oct 11, 2012, 2:05:22 PM
to pbcor...@googlegroups.com
I contributed and have no objection.

thanks for reminding me of this!

From: Yvonne Ng <yvo...@witness.org>
Reply-To: "pbcor...@googlegroups.com" <pbcor...@googlegroups.com>
Date: Thursday, October 11, 2012 12:06 PM
To: "pbcor...@googlegroups.com" <pbcor...@googlegroups.com>
Subject: Re: [PBCore-talk] Re: HELP: Where can I find definitions for PBCore Controlled Vocabulary Terms?

Hi all

I don't know if this is helpful at all, but there is a Google spreadsheet that some folks on this list contributed to way back when WGBH was seeking feedback on vocabularies for PBCore 2.0. The spreadsheets list all of the vocabulary terms, their source, and some comments from various reviewers. As I remember, there wasn't enough project funding to move beyond this step.

Here is the link again for those who are already shared on the document.
https://docs.google.com/a/witness.org/spreadsheet/ccc?key=0Ai_8ajzhB3_ldE1NVUpLUWJ5UE9GUFItWFFoMkRmX0E#gid=2

I don't feel at liberty to open the Doc to everyone since there are contributions from various people in it. But perhaps we can open it up if people think it's even useful and those who contributed have no objection.

Best,
Yvonne

Anne Wootton

Oct 11, 2012, 4:30:08 PM
to pbcor...@googlegroups.com
Allison, Jack, Kim, Thom, Yvonne and Bruce — thanks for the responses.

We'll get in touch with Stephanie et al at the American Archive project and let you know next steps ASAP.

In the meantime, it would be great to get access to the Google spreadsheet of vocabulary terms.

Anne & Bailey

Yvonne Ng

Oct 11, 2012, 4:41:34 PM
to pbcor...@googlegroups.com
Hello again,

I'm guessing that no one is going to object to sharing the Google Spreadsheet, but just in case, how about this...  Feel free to email me personally in the next 24 hours if you have any concerns, otherwise I will share the link with the group tomorrow.

Thanks!
Yvonne

Allison Smith

Oct 12, 2012, 10:53:06 AM
to pbcor...@googlegroups.com
Anne and Bailey -

Thanks so much for getting the ball rolling on this. 

I wanted Thom to know that I couldn't agree with him more on the point he made about looking at existing vocabularies and standards as much as possible, and moving forward using them to the extent they are useful. 

It never makes sense to me to reinvent the wheel.  Librarians and other information professionals have been tackling these vocabulary and standards issues for decades, and have left us with a very good and strong foundation on which to build.  Heck, I still reference AACR2, and find it invaluable. :)

Allison

Yvonne Ng

Oct 15, 2012, 12:14:04 PM
to pbcor...@googlegroups.com
Hi all

Seeing no objections, here is the link to the Google Doc that the review group for the PBCore 2.0 controlled vocabularies was working off last year.


Best,
Yvonne

Dianne Kennedy

Oct 15, 2012, 12:35:18 PM
to pbcor...@googlegroups.com

All this discussion is welcomed.  We look forward to working with the PBCore community and would certainly like to be involved.

 

Dianne Kennedy
VP Emerging Technologies

IDEAlliance
