About the Greenstone OAI-PMH interface


Marcin Werla (PSNC)

Apr 2, 2012, 10:38:11 AM
to acce...@googlegroups.com
Hi!

During the training in Veria we were discussing the possibilities of cooperation between Greenstone (version 2) and Europeana. As some of you may remember, there were two issues detected while testing files exported from Greenstone used by the National and University Library of the Republic of Srpska (http://digital.nub.rs/):
  1. Lack of metadata elements required by Europeana.
  2. Lack of a URL that could be used to send the user from Europeana to the digital library.
Issue (1) can be resolved by adding all the necessary elements to the metadata records; it is not related to Greenstone as digital library software. Issue (2) is more serious because it is tied directly to Greenstone: if we cannot fix it, the digital library will probably have to be moved to another software system.

Let's take a look at one particular book, for example:


In the metadata export from Greenstone we find the following:

<Metadata name="dc.Title">Sabrana djela I</Metadata>
<Metadata name="dc.Date^issued">1973</Metadata>
<Metadata name="dc.Creator">Ivan Franjo Jukić</Metadata>
<Metadata name="dc.Relation^hasFormat">http://digitalna.nubrs.rs.ba/pdf/ifj/ifj1.pdf</Metadata>
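
A quick way to check an export for this kind of link is sketched below. It is only an illustration: the `<Description>` wrapper element and the function names are my own assumptions, added so that the four lines above form a well-formed XML document; they are not taken from the actual export file.

```python
import xml.etree.ElementTree as ET

# The four lines quoted above, wrapped in a hypothetical <Description> root
# element so that they can be parsed as one XML document.
EXPORT_FRAGMENT = """<Description>
<Metadata name="dc.Title">Sabrana djela I</Metadata>
<Metadata name="dc.Date^issued">1973</Metadata>
<Metadata name="dc.Creator">Ivan Franjo Jukić</Metadata>
<Metadata name="dc.Relation^hasFormat">http://digitalna.nubrs.rs.ba/pdf/ifj/ifj1.pdf</Metadata>
</Description>"""


def read_metadata(xml_text):
    """Return a dict mapping each metadata name to the list of its values."""
    root = ET.fromstring(xml_text)
    fields = {}
    for element in root.iter("Metadata"):
        value = (element.text or "").strip()
        fields.setdefault(element.get("name"), []).append(value)
    return fields


def direct_file_links(fields, extensions=(".pdf",)):
    """Return values that are HTTP links pointing straight at a file."""
    links = []
    for values in fields.values():
        for value in values:
            lowered = value.lower()
            is_link = lowered.startswith(("http://", "https://"))
            if is_link and lowered.endswith(extensions):
                links.append(value)
    return links


metadata = read_metadata(EXPORT_FRAGMENT)
pdf_links = direct_file_links(metadata)
print(pdf_links)
```

For this record the only link found is the `dc.Relation^hasFormat` value, which is exactly the direct PDF link discussed next.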

So there is a link, but it points directly to a PDF file. This may be a problem: if you send this to Europeana, users will be redirected from the book description in Europeana straight to the PDF file, so they will have no chance to see the website of your digital library and, for example, use the chapter navigation visible here:


Then I tried to use the OAI-PMH interface of this Greenstone instance, which is available here:

Unfortunately, it seems that the entire set representing books (књиге / knjige) is either not working properly or empty, e.g.:

So I decided to switch to periodicals (часописи). I checked this one:

Босанска вила, Годиште 7, Број 29


In OAI-PMH it looks like this:


As you can see, the identifier of this particular issue is "period:HASH013d0717a67734dc2d40c4f4:1", and Greenstone uses it to create the following link:


Unfortunately, the link is not working for some reason (the result is 404 Not found). The first obvious mistake here is that the link starts with


and not with


and I guess this can be fixed in Greenstone. But even the corrected link does not work properly:


If you compare it with a link generated by the Greenstone web application:


you can see that what makes it work is the following additional part:

e=d-01000-00---off-0period--00-1----0-10-0---0---0direct-10---4-------0-1l--11-sc-50---20-home---00-3-1-00-0-0-11-1-0utfZz-8-00

At the moment I do not know exactly what it means, but it seems that Greenstone can be configured to work without it. I have found, for example, this Greenstone installation:


It has an OAI-PMH interface (although also not fully functional):

and the short links are working somehow:


So to summarize - it seems that you should be able to generate a URL suitable for Europeana from the identifier returned via the Greenstone OAI-PMH interface, but:
1) You must do something to make all your collections visible via OAI-PMH
2) You must do something to make the short version of the links (without the part starting with e=..) work properly
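
To make the idea concrete, here is a rough sketch of how such a short URL could be built from an OAI-PMH identifier. Please treat every detail as an assumption: I assume the identifier has the form "collection:document id" (split at the first colon, so the trailing ":1" stays part of the document id), and that the short link uses the query parameters `a=d`, `c` and `d`. The base URL `http://example.org/cgi-bin/library.cgi` is a placeholder, not the address of any real installation.

```python
from urllib.parse import urlencode


def split_oai_identifier(identifier):
    """Split an identifier of the assumed form 'collection:oid' at the first colon."""
    collection, separator, oid = identifier.partition(":")
    if not separator or not collection or not oid:
        raise ValueError(f"unexpected identifier format: {identifier!r}")
    return collection, oid


def short_document_url(library_url, identifier):
    """Build a short document link (no e=... part) under the stated assumptions."""
    collection, oid = split_oai_identifier(identifier)
    # a=d (document action), c=collection, d=document id: assumed parameter names.
    query = urlencode({"a": "d", "c": collection, "d": oid})
    return f"{library_url}?{query}"


url = short_document_url(
    "http://example.org/cgi-bin/library.cgi",  # placeholder base URL
    "period:HASH013d0717a67734dc2d40c4f4:1",
)
print(url)
```

Whether a link of this shape actually resolves depends on point 2) above, so this only shows the mapping, not a working redirect.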

At the moment I am not able to tell you what this "something" is. As those of you who have been using Greenstone for some time know more about it than I do, I guess you may be able to find the solution. If not, let me know and I will investigate this further.

-- 
Best regards,
Marcin Werla

Dalibor Pancic

Apr 3, 2012, 3:48:55 AM
to acce...@googlegroups.com
Hi

I did some research on Greenstone and found the following:

The imported documents

In order to identify documents internally, a unique object identifier or OID is assigned to each original source document when it is imported (formed by hashing the content, to overcome file duplication effects caused by mirroring) and stored as metadata within that document. It is important that OIDs persist throughout the index-building process, so that a user's search history is unaffected by rebuilding the collection. OIDs are assigned by hashing the contents of the original source document.
Once imported, each document is stored in its own subdirectory of archives, along with any associated files (for example, images). To ensure compatibility with Windows 3.0, only eight characters are used in directory and file names, which causes annoying but essentially trivial complications.
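
The general idea of a content-derived identifier can be sketched as follows. This is not Greenstone's own hashing code, which the text above does not describe: MD5 is used here purely as a stand-in, the 24 hexadecimal digits simply match the length seen in identifiers such as HASH013d0717a67734dc2d40c4f4, and the function names are hypothetical.

```python
import hashlib


def content_oid(data, hex_digits=24):
    """Derive an identifier from the document bytes (MD5 is only a stand-in)."""
    digest = hashlib.md5(data).hexdigest()
    return "HASH" + digest[:hex_digits]


def eight_char_name(oid):
    """Keep only the first eight characters, as an 8-character name limit would."""
    return oid[:8]


original = content_oid(b"the same document bytes")
mirrored = content_oid(b"the same document bytes")
other = content_oid(b"a different document")

print(original == mirrored)       # identical content gives the identical id
print(original == other)          # different content gives a different id
print(eight_char_name(original))  # "HASH" followed by four hex digits
```

Because the identifier depends only on the content, a mirrored copy of a file gets the same OID, and rebuilding the collection does not change it.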

In our case:



 "To ensure compatibility with Windows 3.0, only eight characters are used in directory and file names, which causes annoying but essentially trivial complications."   -----> HASHb878

Best Regards

Dalibor Pancic




Marcin Werla (PSNC)

Apr 4, 2012, 3:16:15 AM
to acce...@googlegroups.com
Hi!

Thanks for this additional info. I understand that we could try to work out how to get a URL like the one you wrote from the HASH identifier taken from the OAI-PMH response. But I guess this will not be enough for you, because you would only be able to redirect users from Europeana to a particular file:


or list of files:


which is not really user-friendly and does not promote your digital library.

As for the remark about Windows 3.0 compatibility... Well, I understand that this may still be important for some use cases, but I guess you should consider moving your DL to a system that does not limit itself because of such compatibility requirements.

Best,
Marcin

Dalibor Pancic

Jun 9, 2012, 6:25:17 AM
to acce...@googlegroups.com
Hi,

I solved the OAI-PMH problem by upgrading Greenstone to version 2.85 and making a few changes in the AllList.pm classifier, which enabled the extraction of metadata from the articles of the journals collection.

After that, I mapped metadata to ESE by using Mint.

You can check how it looks now at http://digital.nub.rs and http://digital.nub.rs/greenstone/cgi-bin/oaiserver.cgi?verb=ListRecords&metadataPrefix=oai_dc&set=knjige
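
For anyone who wants to check the set programmatically, the request above can be put together and the record identifiers read out roughly like this. It is only a sketch: the function names are my own, and `SAMPLE_RESPONSE` is a made-up minimal response used so the example runs without network access; it is not real output from the server. Only the standard OAI-PMH namespace and the `verb`, `metadataPrefix` and `set` parameters come from the protocol itself.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespace defined by the OAI-PMH 2.0 protocol.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def list_records_url(base_url, metadata_prefix="oai_dc", set_spec=None):
    """Build a ListRecords request URL, optionally limited to one set."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{base_url}?{urlencode(params)}"


def record_identifiers(response_xml):
    """Return the identifier of every record header in a response."""
    root = ET.fromstring(response_xml)
    headers = root.findall(".//oai:header/oai:identifier", OAI_NS)
    return [element.text.strip() for element in headers if element.text]


# Made-up minimal response, for illustration only.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>period:HASH013d0717a67734dc2d40c4f4:1</identifier>
      </header>
    </record>
  </ListRecords>
</OAI-PMH>"""

request_url = list_records_url(
    "http://digital.nub.rs/greenstone/cgi-bin/oaiserver.cgi", set_spec="knjige"
)
identifiers = record_identifiers(SAMPLE_RESPONSE)
print(request_url)
print(identifiers)
```

An empty list from a real response would indicate the same problem as before, namely a set that exists but exposes no records.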

Regards,

Dalibor Pancic
 

