Incremental META information

Paul Stubbe

Jun 11, 2014, 7:46:04 AM

to dezi-...@googlegroups.com

Hi,

First of all: THANKS for the Dezi 3.0 release.

I have the following scenario:

A user can add a file to my repository

and then I add this file to the DEZI search infrastructure.

After this the file is never changed, except when the file is deleted at the end of its lifetime.

But not so with the META information about the file.

Paul Stubbe

Jun 11, 2014, 7:50:12 AM

to dezi-...@googlegroups.com

The META information can grow with every user of the system.

Every user can add a 'tag' with a 'description' that can be used by others to search for something.

Every time someone adds some META information, do I need to reload the original file and all the META information that was already loaded into the Dezi system, or is there a way to do this incrementally?

Thanks for any advice.

Paul

Peter Karman

Jun 13, 2014, 8:38:35 AM

to dezi-...@googlegroups.com

Hi Paul,

Conceptually, a "document" in Dezi (or any IR library -- Lucene, Xapian,
etc.) is whatever you make it. It could be a file on disk or a URI
(resource) or the serialization of a db record or whatever. It's up to
you.

What Dezi is *not* is a relational database, so if you have metadata
about the document that you want stored, you must update the entire
document in the index. It's a "replace" not an "update". The HTTP method
is PUT for Dezi.
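
To make the "replace" semantics concrete, here is a minimal Python sketch of what a client would have to do with a single index: rebuild the whole document, old tags included, and PUT it again. The repeated <tag>/<description> layout and the helper names are hypothetical, and the /index/ path is assumed from the GET URLs shown later in this message; the request is only prepared, not sent.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_document(title, body, tags):
    """Serialize the file content plus ALL of its tags as one XML document.

    The field layout (repeated <tag>/<description> pairs) is hypothetical.
    """
    doc = ET.Element("doc")
    ET.SubElement(doc, "title").text = title
    ET.SubElement(doc, "body").text = body
    for tag, description in tags:
        ET.SubElement(doc, "tag").text = tag
        ET.SubElement(doc, "description").text = description
    return ET.tostring(doc, encoding="utf-8")


def build_replace_request(base_url, uri, payload):
    """Prepare (but do not send) a PUT that replaces the document at `uri`."""
    return urllib.request.Request(
        url=f"{base_url}/index/{uri}",  # path assumed, not confirmed
        data=payload,
        method="PUT",
        headers={"Content-Type": "application/xml"},
    )


# Adding one tag means re-sending the file content and every earlier tag.
tags = [("foo", "the important stuff is very foo")]
tags.append(("bar", "a second, made-up tag"))
payload = build_document("my document", "important stuff", tags)
req = build_replace_request("http://localhost:5000/files", "myfile.xml", payload)
# urllib.request.urlopen(req) would actually send the request.
```

The point of the sketch is the cost: every new tag forces the full document to be rebuilt and re-indexed.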

For what you're describing, though, it might make sense to have 2
indexes: one for the "file" and then one for each tag+description about
the file. While Dezi isn't relational in the sense of RDBMS, you can
still refer to things by their URI, just like HTML does.

You could serve these indexes from your Dezi instance with the
MultiTenant feature:

https://metacpan.org/release/Dezi-MultiTenant

with an (example, untested) config like:

{
    '/files' => { },    # index-specific config here
    '/tags'  => {
        indexer_config => {
            config => {
                MetaNames     => 'file tag description author',
                PropertyNames => 'file tag description author',
            },
        },
    },
}

Example:

% cat myfile.xml
<doc>
<title>my document</title>
<body>important stuff</body>
</doc>

% dezi-client --server http://localhost:5000/files myfile.xml
# you can now GET http://localhost:5000/files/index/myfile.xml

% cat myfile-tag.xml
<doc>
<tag>foo</tag>
<description>the important stuff is very foo</description>
<author>someone</author>
<file>myfile.xml</file>
</doc>

% dezi-client --server http://localhost:5000/tags myfile-tag.xml
# you can now GET http://localhost:5000/tags/index/myfile-tag.xml

% dezi-client --server http://localhost:5000/tags -q file:myfile.xml
# returns all the tags for myfile.xml
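
If the tag documents are generated by an application rather than written by hand, the same idea can be sketched in Python: one small document per tag, plus the query that collects all tags for a file. The function names and the document naming scheme are hypothetical, and the /search?q= endpoint under /tags is an assumption about how the MultiTenant mount is exposed; nothing is sent over the network here.

```python
import urllib.parse
import xml.etree.ElementTree as ET


def build_tag_document(file_uri, tag, description, author):
    """Build one tag document that points back at the file by its URI."""
    doc = ET.Element("doc")
    ET.SubElement(doc, "tag").text = tag
    ET.SubElement(doc, "description").text = description
    ET.SubElement(doc, "author").text = author
    ET.SubElement(doc, "file").text = file_uri
    return ET.tostring(doc, encoding="unicode")


def tag_document_uri(file_uri, author, tag):
    """Hypothetical naming scheme: one document per (file, author, tag)."""
    return f"{file_uri}-{author}-{tag}.xml"


def tags_query_url(base_url, file_uri):
    """URL for 'all tags of this file' (search endpoint path is assumed)."""
    query = urllib.parse.urlencode({"q": f"file:{file_uri}"})
    return f"{base_url}/search?{query}"


xml_text = build_tag_document(
    "myfile.xml", "foo", "the important stuff is very foo", "someone"
)
doc_uri = tag_document_uri("myfile.xml", "someone", "foo")
url = tags_query_url("http://localhost:5000/tags", "myfile.xml")
```

With this layout a new tag is a new, small document in the tags index, and the file document in the files index is never touched again.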

--
Peter Karman . http://peknet.com/ . pe...@peknet.com
