Re: Getting library resources into search engines


Casey Bisson

Feb 11, 2008, 11:24:27 AM
to scri...@googlegroups.com

In a Feb 10, 2008 message to the web4lib list, Edward Spodick wrote
some interesting stuff about the importance of a good html title tag
and mentioned Scriblio as an application that does well in that respect
(Yay!). I'm copying my response here, as it includes some Scriblio
specific information:

> [...]as someone else alluded to, many ILS systems, including both
> the III and VuFind implementations, suffer from using a generic
> TITLE tag in the HEAD of the html - so the title tag, which is used
> for displaying Google search results, will just be something like
> the following for every record.
> "Hong Kong University of Science and Technology"
> "Library Resource Finder: Record Holdings"
> Not very useful when what the user would want would probably be the
> title of the item.
>
> The Scriblio implementation does a better job on this aspect, at
> least in our implementation, with things like
> "HKUST Library Catalog » Japanese popular music : culture,
> authenticity, and power"
> as http://catalog.ust.hk/catalog/archives/710731

Not only that, but you can change how it's represented in the theme.
Some WordPress users have invested serious time into thinking about
how those things should work; the default Scriblio theme follows the
conventions set by other WordPress default themes, but there's no
reason you can't make changes.


> So while I do plan to explore the Sitemap method of exposing these
> permanent links, until [the page title metadata] is fixed the
> results in search engines may not appear too useful.

You can also use the WordPress sitemap plugin to do that. I've not
tried it yet (and you might find it's not coded to handle the nearly
1 million records your collection has), but here's a link:

http://wordpress.org/extend/plugins/google-sitemap-generator/

Spode

Feb 25, 2008, 2:42:30 AM
to scri...@googlegroups.com, Casey Bisson
At 11:24 AM -0500 2/11/08, Casey Bisson wrote:
> In a Feb 10, 2008 message to the web4lib list, Edward Spodick wrote
>> So while I do plan to explore the Sitemap method of exposing these
>> permanent links, until [the page title metadata] is fixed the
>> results in search engines may not appear too useful.
>
> You can also use the WordPress sitemap plugin to do that. I've not
> tried it yet (and you might find it's not coded to handle the nearly
> 1 million records your collection has), but here's a link:
>
> http://wordpress.org/extend/plugins/google-sitemap-generator/

Currently, it seems that the plugin will only generate a single sitemap
file for an entire WordPress installation. But the sitemap specification
allows no more than 50,000 URLs per sitemap file. I am waiting for the
developer of the plugin to respond regarding the possible generation of
the appropriate sitemap index file with multiple sitemap files of at most
50,000 links each. That would also be much less of a memory burden;
apparently a number of users of the plugin already run into RAM
exhaustion issues.
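To make the sitemap index idea concrete, here is a rough sketch in
Python (not the plugin's code, which is PHP and which I have not read).
It splits a list of permalinks into parts of at most 50,000 URLs and
builds one index file pointing at them. The function names, the
sitemap-N.xml file names, and the example.org base URL are all made up
for illustration.

```python
from xml.sax.saxutils import escape

SITEMAP_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
MAX_URLS = 50_000  # per-file URL limit in the sitemap protocol


def chunk(urls, size=MAX_URLS):
    """Yield successive lists of at most `size` URLs."""
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def build_sitemap(urls):
    """Return the XML text of a single sitemap file."""
    entries = "".join(
        f"  <url><loc>{escape(u)}</loc></url>\n" for u in urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<urlset xmlns="{SITEMAP_NS}">\n{entries}</urlset>\n'
    )


def build_index(sitemap_urls):
    """Return the XML text of the sitemap index file."""
    entries = "".join(
        f"  <sitemap><loc>{escape(u)}</loc></sitemap>\n"
        for u in sitemap_urls
    )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        f'<sitemapindex xmlns="{SITEMAP_NS}">\n{entries}</sitemapindex>\n'
    )


def build_all(urls, base_url, size=MAX_URLS):
    """Return {filename: xml} for every sitemap part plus the index.

    The file names are hypothetical; any naming scheme works as long
    as the index lists the full URL of each part.
    """
    files = {}
    part_urls = []
    for n, part in enumerate(chunk(urls, size), start=1):
        name = f"sitemap-{n}.xml"
        files[name] = build_sitemap(part)
        part_urls.append(f"{base_url}/{name}")
    files["sitemap-index.xml"] = build_index(part_urls)
    return files
```

This version keeps everything in memory for simplicity. To get the
memory benefit, a real implementation would fetch and write out one
part at a time instead of collecting them all in a dict.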

I will post here if I learn anything more on this.

-Edward
