John Salter [University of Leeds]
Sep 9, 2009, 11:57:46 AM
to juice-project-discuss
Just thought a quick 'Hello' would be in order - now I've started
playing with this.
We're a III Millennium site which presents a whole host of horrible
data to try and scrape for ISBNs, Titles and Authors. The main issue
is that the metadef jquery definition doesn't allow us to use jQuery
functions on the selected nodes.
E.g. (don't blame me for this code - it's how it comes out of the
catalogue!)
-----------------------------------
<tr>
<td width="20%" valign="top" class="bibInfoLabel">Title</td>
<td class="bibInfoData">
<strong>JavaScript : <font color="RED"><strong>the</strong></font>
<font color="RED"><strong>definitive</strong></font> guide / David
Flanagan.</strong></td></tr>
-----------------------------------
Using the normal Juice metadef style:
juice.addMeta(new JuiceMeta("isbns", "td.bibInfoLabel:contains('Title') + td *"));
-> returns the outer <strong> text (without the nested <font> content)
first, followed by the nested <strong><font><strong> elements, e.g.
[JavaScript : guide / David Flanagan.][the][definitive]
juice.addMeta(new JuiceMeta("isbns", "td.bibInfoLabel:contains('Title') + td strong"));
->returns the same as above
We can't pass this to a cleaning function, as we have no way of
knowing which elements came first in the screen order. I couldn't
think of any other way to access the data using just selectors - the
nested children are quite problematic here!
To fix this, I just need to use the jQuery.text() function on the td:
var metaTitle = $("td.bibInfoLabel:contains('Title') + td").text().trim();
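One thing to watch: the string that .text() hands back still carries the line breaks from the catalogue markup, and trim() only removes them at the ends. A rough sketch of a clean-up step follows. normaliseText is just a name I made up for illustration, and the sample string is what I'd expect .text() to return for the row above:

```javascript
// Hypothetical helper (not part of Juice or jQuery): collapse every run of
// whitespace to a single space, then strip the leading/trailing space.
function normaliseText(raw) {
  return String(raw).replace(/\s+/g, " ").replace(/^ | $/g, "");
}

// Assumed raw value of .text() for the sample <td> shown earlier.
var rawTitle = "\nJavaScript : the\ndefinitive guide / David\nFlanagan.";
var metaTitle = normaliseText(rawTitle);
```

Because the text comes back as one string in screen order, the highlighted words stay in the right place and only the whitespace needs tidying.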
I've come up with a solution to get around this:
In the main file where you define which extensions you will use:
-make a hidden div (<div id="juiceMetadataElements"
style="display:none;">)
-make another div (or span) element in here for each Meta item (ISBNs,
Title, Author)
Fill each of these with the data as needed. At this point you can use
the whole of jQuery and JavaScript to mangle the data however you
want.
In the metadef file you can then reference the newly created elements
easily to define the data.
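To make the steps above a bit more concrete, here is a minimal sketch of building the hidden container as a string. Only the juiceMetadataElements id comes from my setup; the function names and the "juiceMeta-" id prefix for the child spans are assumptions for this example:

```javascript
// Escape text so it can be placed safely inside generated markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Build the hidden div with one <span> per meta item.
// The "juiceMeta-" id prefix is hypothetical, chosen for this sketch.
function buildHiddenMeta(items) {
  var spans = Object.keys(items).map(function (name) {
    return '<span id="juiceMeta-' + escapeHtml(name) + '">' +
      escapeHtml(items[name]) + "</span>";
  });
  return '<div id="juiceMetadataElements" style="display:none;">' +
    spans.join("") + "</div>";
}

var html = buildHiddenMeta({
  title: "JavaScript : the definitive guide / David Flanagan."
});
```

On the page you would then do something like $("body").append(html), and the metadef could point at the new element with a plain id selector, e.g. new JuiceMeta("title", "#juiceMeta-title"), assuming JuiceMeta takes any jQuery selector as in the examples above.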
I needed this approach because in the default III setup the Title
field can have search terms highlighted in it. This in turn leads to
nested elements that aren't returned in screen-order by the default
metadef definition process.
Hopefully I'll have this available on an open server sometime - I'll
continue working on a III metadef and post my results here/add them to
the code base.
If you want more info on how far I've got, just ask!