Filtering on Dublin Core metadata possible?


David

Aug 12, 2009, 9:25:03 AM
to Google Search Appliance/Google Mini
Dublin Core metadata uses '.' in the element names, such as DC.title
or DC.date.created. In the past, we've been unable to access this
metadata directly due to the way the '.' is used as an 'and' in
metadata calls. Double encoding doesn't seem to work.

We can use a wild card to get the metadata and post-process the
results, but we really want to pre-filter based on specific metadata
elements and values.

Does version 6 improve on this in any way, or does someone have a
workaround?

Joe D'Andrea

Aug 12, 2009, 10:36:58 AM
to Google-Search-...@googlegroups.com
Greetings!

On Wed, Aug 12, 2009 at 9:25 AM, David<david...@hhs.gov> wrote:

> Dublin Core metadata uses '.' in the element names, such as DC.title
> or DC.date.created.  In the past, we've been unable to access this
> metadata directly due to the way the '.' is used as an 'and' in
> metadata calls. Double encoding doesn't seem to work.

<!>

> We can use a wild card to get the metadata and post-process the
> results, but we really want to pre-filter based on specific metadata
> elements and values.

Hmm. Does %20 work as a separator? For instance, "DC%20date%20created" - ?

Or I may be misunderstanding. If you have a specific search example
where this breaks down, post it here and we'll put it under the
proverbial microscope.

p.s. - I heart Dublin Core. :)

--
Joe D'Andrea
Liquid Joe LLC | Google Enterprise Partner
www.liquidjoe.biz | skype:joedandrea | +1 (908) 781-0323

playdough

Aug 21, 2009, 2:01:01 PM
to Google Search Appliance/Google Mini
Hi David,

We've run into the same problem. Our institution uses GSA V5.2. We've
modified the separator from a dot to a dash (e.g. DC-creator instead
of DC.creator). In our tests, we've found that both the dot and colon
are parsed incorrectly by the GSA in metadata searches. The dash,
however, seems to work.

If you send me an email message (click the "view profile" link next to
my name), I can send you a link to a test page with multiple search
forms. The forms are proof of concept using test metadata pages we've
loaded into our GSA.

As far as I understand, it's not a required spec to use the dot
separator, but it is recommended by the Dublin Core Metadata
Initiative. Of course, the world would be a better place if every
agency and organization implemented DC in the same format. If you've
seen the requirement for using the dot separator listed on its site
(DC.creator, etc.), please post a link; I'd be very interested in
seeing it.

I'm looking into when Google plans on fixing the issue, or if it's
already fixed with V6. What version of the GSA are you using?

Ignatius

justin.brister

Aug 22, 2009, 8:04:12 AM
to Google Search Appliance/Google Mini
You need to encode the '.' as %2e in the search box.

On the URL string, you double-encode it as %252e.

The same principle applies to '-' or any other special character in
the query.
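
A minimal Python sketch of the two encoding levels described above. The helper names are made up for illustration, and the code only manipulates strings; it does not talk to an appliance.

```python
from urllib.parse import quote

def encode_meta_name(name: str) -> str:
    """Replace each '.' with %2e: the form typed into the search box."""
    return name.replace(".", "%2e")

def encode_for_url(search_box_text: str) -> str:
    """Percent-encode the search-box text for a URL, so %2e becomes %252e."""
    return quote(search_box_text, safe="")

single = encode_meta_name("DC.date.created")
print(single)                  # DC%2edate%2ecreated
print(encode_for_url(single))  # DC%252edate%252ecreated
```

The second step is ordinary URL encoding: the '%' of %2e is itself escaped as %25, which is where %252e comes from.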

J

Arthur Winailan

Aug 24, 2009, 9:22:02 AM
to Google-Search-...@googlegroups.com
Hi David,

You can use inmeta to search Dublin Core meta tags. We use Dublin Core meta tags such as "Overheid.OrganisationType" and "DC.Title", and we are able to search this metadata by using "%2E" as the substitute for the "." (dot).

For example: "nieuws inmeta:OVERHEID%2EorganisationType=inspectie" or "propofol inmeta:DC%2ETitle=Nieuwsberichten"
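
If you build such queries in code, something like the following Python sketch could assemble them. The function name is hypothetical, and the only rule it applies is the one from this thread: write the dot in the tag name as %2E.

```python
def inmeta_query(keywords: str, meta_name: str, value: str) -> str:
    """Build a keyword query restricted to documents whose meta tag equals value."""
    # Rule taken from this thread: '.' in the tag name is written as %2E.
    encoded_name = meta_name.replace(".", "%2E")
    return f"{keywords} inmeta:{encoded_name}={value}"

print(inmeta_query("nieuws", "OVERHEID.organisationType", "inspectie"))
# nieuws inmeta:OVERHEID%2EorganisationType=inspectie
```
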
Good luck.

Arthur
--
Arthur Winailan
NewSync Technologies
Google Enterprise Search Consultant

playdough

Aug 24, 2009, 12:46:46 PM
to Google Search Appliance/Google Mini
Thank you for the help on this issue. It solves the problem.

In our test search forms, an example of the input field (add the
starting and closing angle brackets) is now:
input type="hidden" name="partialfields" value="DC%2epublisher:[name of department]" /
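
For anyone generating such forms from code, here is a rough Python sketch that renders the hidden field with the dot already encoded. The function name and the department value are placeholders, not taken from any real form.

```python
from html import escape

def partialfields_input(meta_name: str, value: str) -> str:
    """Render a hidden partialfields input with '.' in the tag name written as %2e."""
    encoded_name = meta_name.replace(".", "%2e")
    # Escape the attribute value so quotes or ampersands cannot break the markup.
    field_value = escape(f"{encoded_name}:{value}", quote=True)
    return f'<input type="hidden" name="partialfields" value="{field_value}" />'

print(partialfields_input("DC.publisher", "Example Department"))
# <input type="hidden" name="partialfields" value="DC%2epublisher:Example Department" />
```

When the form is submitted, the browser percent-encodes the value, so the %2e in the field should reach the URL as %252e, which matches the double encoding mentioned earlier in the thread.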

Ignatius

On Aug 22, 8:04 am, "justin.brister" <justin.bris...@googlemail.com>
wrote:

Labendi.SEOservices - Belinda

Aug 26, 2009, 10:29:36 AM
to Google Search Appliance/Google Mini
Hi Arthur,

For a customer, we are implementing the Dublin Core - OWMS standards
(Netherlands), and we want to implement the Google Mini. I want to be
sure that the indexing software handles the meta elements well. Can
you tell me more about this?

greetz
Belinda

Arthur Winailan

Aug 27, 2009, 4:38:38 AM
to Google-Search-...@googlegroups.com
Hi Belinda,

Unfortunately, the Google Mini can't index meta tags the way the GSA does. But it's possible to filter the documents by meta tags.

Cheers,

Arthur