Hello - from an Innovative site!


John Salter [University of Leeds]

Sep 9, 2009, 11:57:46 AM
to juice-project-discuss
Just thought a quick 'Hello' would be in order - now I've started
playing with this.

We're a III Millennium site which presents a whole host of horrible
data to try and scrape for ISBNs, Titles and Authors. The main issue
is that the metadef jQuery definition doesn't allow us to use jQuery
functions on the selected nodes.

E.g. (don't blame me for this code - it's how it comes out of the
catalogue!)
-----------------------------------
<tr>
<td width="20%" valign="top" class="bibInfoLabel">Title</td>
<td class="bibInfoData">
<strong>JavaScript : <font color="RED"><strong>the</strong></font>
<font color="RED"><strong>definitive</strong></font> guide / David
Flanagan.</strong></td></tr>
-----------------------------------
Using the normal Juice metadef style:
juice.addMeta(new JuiceMeta("isbns", "td.bibInfoLabel:contains('Title') + td *"));
-> returns the outer <strong> element first (without the text nested in
<font><strong>), followed by the nested <font><strong> elements, e.g.
[JavaScript : guide / David Flanagan.][the][definitive]
juice.addMeta(new JuiceMeta("isbns", "td.bibInfoLabel:contains('Title') + td strong"));
-> returns the same as above
We can't pass this to a cleaning function, as we have no way of
knowing which elements came first in the screen order. I couldn't
think of any other way to access the data using just selectors - the
nested children are quite problematic here!

To fix this, I just need to use the jQuery .text() function on the td:
var metaTitle = $("td.bibInfoLabel:contains('Title') + td").text().trim();
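
The reason .text() helps here is that it concatenates the text of all
descendants in document (screen) order, which the per-element metadef
results don't preserve. A minimal sketch of that idea, runnable without a
browser: the nested-object "DOM", textInDocumentOrder() and cleanText() are
hypothetical names made up for illustration, not part of jQuery or Juice.

```javascript
// Hypothetical stand-in for the DOM: an element is { children: [...] },
// a text node is a plain string. This mirrors the Title cell shown above.
const titleCell = {
  children: [
    "\n",
    {
      // outer <strong>
      children: [
        "JavaScript : ",
        { children: [{ children: ["the"] }] },        // <font><strong>the</strong></font>
        "\n",
        { children: [{ children: ["definitive"] }] }, // <font><strong>definitive</strong></font>
        " guide / David\nFlanagan.",
      ],
    },
  ],
};

// Depth-first walk that joins text nodes in the order they appear,
// so nested highlights stay in their on-screen position.
function textInDocumentOrder(node) {
  if (typeof node === "string") return node;
  return node.children.map(textInDocumentOrder).join("");
}

// Collapse runs of whitespace (including line breaks) and trim the ends.
function cleanText(raw) {
  return raw.replace(/\s+/g, " ").trim();
}

const metaTitle = cleanText(textInDocumentOrder(titleCell));
console.log(metaTitle);
```

With the highlighted words kept in place, the result can then go to any
further cleaning function as a single string.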

I've come up with a workaround for this.
In the main file where you define which extensions you will use:
- make a hidden div (<div id="juiceMetadataElements" style="display:none;">)
- make another div (or span) element inside it for each Meta item (ISBNs,
  Title, Author)
- fill each of these with the data as needed. At this point you can use
  the whole of jQuery and JavaScript to mangle the data however you want.

In the metadef file you can then reference the newly created elements
easily to define the data.
I needed this approach because in the default III setup the Title
field can have search terms highlighted in it. This in turn leads to
nested elements that aren't returned in screen-order by the default
metadef definition process.

Hopefully I'll have this available on an open server sometime - I'll
continue working on a III metadef and post my results here/add them to
the code base.

If you want more info on how far I've got, just ask!

Richard Wallis

Sep 10, 2009, 6:11:00 AM
to juice-proj...@googlegroups.com
Hi John,

When designing the addMeta functionality, one concern was how much jQuery selection and parsing functionality to expose.  It is clear that the Millennium interface is stretching its capability.

The need for more flexibility in capturing and storing meta values is what was behind the changes in version 0.6.  From that revision, juice.findMeta() replaces juice.addMeta(), and there is a new juice.setMeta() that lets you do as much mangling as you want and then set the value directly, rather than having to insert it into the page and identify it later.
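
A rough sketch of that "mangle first, then set the value directly" flow is below. It does not call the real Juice library: juiceStub is a hypothetical stand-in so the sketch runs on its own, and the (name, value) call shape and the "title" meta name are assumptions to check against the 0.6 source rather than a statement of the actual API.

```javascript
// Hypothetical stand-in for the Juice object (NOT the real library).
// The setMeta(name, value) shape is an assumption for illustration only.
const juiceStub = {
  values: {},
  setMeta(name, value) {
    this.values[name] = value;
  },
};

// Raw text as it might come back from something like
// $("td.bibInfoLabel:contains('Title') + td").text() on the Millennium page.
const rawTitle = "\nJavaScript : the\ndefinitive guide / David\nFlanagan.";

// Do whatever cleaning is needed in plain JavaScript first...
const title = rawTitle.replace(/\s+/g, " ").trim();

// ...then store the finished value directly, with no hidden div in the page.
juiceStub.setMeta("title", title);
console.log(juiceStub.values.title);
```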

Hope this makes things easier - let us know how you get on.  

Great to have you on board.

~Richard.


John Salter [University of Leeds]

Sep 10, 2009, 7:22:30 AM
to juice-project-discuss
>Great to have you on board.
Great to be here! I was most impressed by the demo in Huddersfield -
but had to wait until we had a stable dev server to do anything with
it!

I'll also be looking at hooking Juice into an EPrints repository -
which will be a lot less painful than a III OPAC!
