Getting masterlist of PURLs from PURLZ

Chris J

May 24, 2011, 11:40:16 AM
to persistenturls
Montana State Library has nearly 5300 PURLs on purl.org.

Thanks to Jeff Young's direction, I used the following VBScript to
pull a masterlist of all our PURLs. I did get an XML response, which
was delightful. However, it contained only 100 items out of 5300. How
do I use the code to get all the PURLs we have at IA? Is this a coding
issue on my end, a lack of documentation, or an issue with the default
limits of the masterlist response?

Thanks,

Here's the vb that returned just 100 items.

'============================================================
'GET the master list
'============================================================
sender.open "Get", "http://purl.org/admin/purl/?maintainers=maintainer1, maintainer2, maintainer3"
sender.send
'WScript.echo sender.responseText


Here's more vb for context:

'===========================================================
'Get a session cookie
'===========================================================

Set sender = CreateObject("MSXML2.ServerXMLHTTP")
sender.open "GET", "http://purl.org/admin/loginstatus"
sender.send
'WScript.echo sender.responseText
cookie = sender.getResponseHeader("Set-Cookie")

Set sender = Nothing

'=============================================================
'POST the same Session cookie back to login and keep the session open
'=============================================================

Set sender = CreateObject("MSXML2.ServerXMLHTTP")

sender.open "POST", "http://purl.org/admin/login/login-submit.bsh?id=xxxxxxxx&passwd=yyyyyyyyy&"
sender.setRequestHeader "Cookie", cookie
sender.send

'============================================================
'GET the master list
'============================================================
sender.open "Get", "http://purl.org/admin/purl/?maintainers=maintainer1, maintainer2, maintainer3"
sender.send
'WScript.echo sender.responseText

'=============================================================
'Record the response
'=============================================================
prehtml = sender.responseText
xmlmasterlistresponse.WriteLine prehtml + Chr(13)
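
The excerpt above writes to xmlmasterlistresponse without showing where it is opened. Here is a minimal sketch of one way to set that up and to count the records that came back. It assumes the stream is created with Scripting.FileSystemObject; the file name masterlist.xml, the sample string, and its <results> wrapper element are all hypothetical stand-ins, not the actual purl.org response.

```vbscript
Option Explicit

' Stand-in for sender.responseText. The <results> wrapper is an assumption
' about the response shape, not something confirmed in this thread.
Dim prehtml
prehtml = "<results><purl status=""1""><id>/msl/test/a</id></purl></results>"

' Open the output file (hypothetical name), overwriting any existing copy.
Dim fso, xmlmasterlistresponse
Set fso = CreateObject("Scripting.FileSystemObject")
Set xmlmasterlistresponse = fso.CreateTextFile("masterlist.xml", True)
xmlmasterlistresponse.WriteLine prehtml
xmlmasterlistresponse.Close

' Parse the response and count the <purl> elements it contains.
Dim doc
Set doc = CreateObject("MSXML2.DOMDocument.6.0")
doc.async = False
If doc.loadXML(prehtml) Then
    WScript.Echo "purl records returned: " & doc.selectNodes("//purl").length
Else
    WScript.Echo "Response is not well-formed XML: " & doc.parseError.reason
End If
```

Run against the real response text, the count would show directly whether the service returned 100 records or the full set.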

Chris J

May 25, 2011, 10:27:41 AM
to persistenturls
Message contains a typo:

This sentence:

>How do I use the code to get all the PURLs we have at IA?

should say:

>How do I use the code to get all the PURLs we have at purl.org?

Chris

Young,Jeff (OR)

Jun 8, 2011, 11:05:59 AM
to persist...@googlegroups.com
Chris,

My guess is that there is no way around this limit through the web service. Send me some clues and I'll try to query the database directly.

Jeff

Chris J

Jun 9, 2011, 3:42:41 PM
to persistenturls
Jeff,

As you know I am trying to figure out how to get a masterlist of purls
that we have at purl.org. I have more than one motivation for
requesting this list, but the primary motivation is to know what
support is available if we actually had to retarget all the PURLS and
to make sure we understand the batch process that would be needed to
do this. So, I am not at this point thinking of regularly requesting
the list. Down the line, however, it would be nice if the web service
was fully responsive. Or perhaps there is some way to address in a
loop and pull all the purls in groups of 100.

Thanks for your help,

Identifying info for our PURLs:

1. Our purlid domain is msl. There are also five items under
/msl/test/.

2. Maintainer login is jollyrog. (All or almost all PURLs have been
added with this maintainer id.) The other maintainer is lara05. (I
may have added 1 or 2 PURLs with this id, if at all.) Of course, there
is also an admin login, but I don't think I used that for adding
PURLs.

3. Group id is mslfix.

4. I use a database table here, which is keyed with an OCLC number, to
ensure we have only one PURL per OCLC number. Monographs only need one
PURL. We use a query target to group our serials by oclc number. So,
each serial only needs one PURL. Examples are below. The total PURLS
per our list is 5268.

monograph
<purl status="1">
  <id>/msl/0A766170-480D-4097-BA3D-EFEA2113A503</id>
  <type>302</type>
  <maintainers><uid>jollyrog</uid><gid>mslfix</gid></maintainers>
  <target><url>http://www.archive.org/details/0A766170-480D-4097-BA3D-EFEA2113A503</url></target>
</purl>

serial
<purl status="1">
  <id>/msl/3F90D3D5-CE09-4004-A51D-5FE1BBF81AE9</id>
  <type>302</type>
  <maintainers><uid>jollyrog</uid><gid>mslfix</gid></maintainers>
  <target><url>http://www.archive.org/search.php?query=collection:MontanaStateLibrary AND oclc-id:46875814</url></target>
</purl>
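
For the retargeting question, records in this shape could be turned into a flat worklist of PURL id, redirect type, and current target. The following is only a sketch under assumptions: the two records are copied from the examples above, the enclosing <results> element is hypothetical (added so the text parses as one document), and the tab-separated output format is just one convenient choice.

```vbscript
Option Explicit

' Sample input: the monograph and serial records from this message,
' wrapped in an assumed <results> root element.
Dim xmlText
xmlText = "<results>"
xmlText = xmlText & "<purl status=""1""><id>/msl/0A766170-480D-4097-BA3D-EFEA2113A503</id>"
xmlText = xmlText & "<type>302</type>"
xmlText = xmlText & "<maintainers><uid>jollyrog</uid><gid>mslfix</gid></maintainers>"
xmlText = xmlText & "<target><url>http://www.archive.org/details/0A766170-480D-4097-BA3D-EFEA2113A503</url></target></purl>"
xmlText = xmlText & "<purl status=""1""><id>/msl/3F90D3D5-CE09-4004-A51D-5FE1BBF81AE9</id>"
xmlText = xmlText & "<type>302</type>"
xmlText = xmlText & "<maintainers><uid>jollyrog</uid><gid>mslfix</gid></maintainers>"
xmlText = xmlText & "<target><url>http://www.archive.org/search.php?query=collection:MontanaStateLibrary AND oclc-id:46875814</url></target></purl>"
xmlText = xmlText & "</results>"

' Parse the text with MSXML and stop if it is not well-formed.
Dim doc, node
Set doc = CreateObject("MSXML2.DOMDocument.6.0")
doc.async = False
If Not doc.loadXML(xmlText) Then
    WScript.Echo "Parse error: " & doc.parseError.reason
    WScript.Quit 1
End If

' Emit one tab-separated line per PURL: id, redirect type, current target.
For Each node In doc.selectNodes("//purl")
    WScript.Echo node.selectSingleNode("id").text & vbTab & _
                 node.selectSingleNode("type").text & vbTab & _
                 node.selectSingleNode("target/url").text
Next
```

Run with cscript, the output can be redirected to a file and compared against the local OCLC-keyed table.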