ZoieIndexable responsibility


Roman_Garcia

Oct 12, 2011, 10:34:39 PM
to zo...@googlegroups.com
Hi,

I can't figure out why a ZoieIndexable responds to both of these messages:
getUID
getIndexingReq[]

To me, that means that many documents can (and will, based on all the DataConsumer implementations I've seen so far) map to a single UID.
Which means that any subsequent event with the same UID will delete all of these documents from the index.
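
To make the concern concrete, here is a minimal in-memory sketch of the behavior described above. It is not Zoie's code: `TinyIndex`, `consume` and the string "documents" are hypothetical stand-ins, and it assumes the consumer drops every document stored under a UID before adding the documents from the new event.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an index keyed by UID; not part of Zoie.
final class TinyIndex {
    // Each UID maps to all documents produced by the latest event for that UID.
    private final Map<Long, List<String>> docsByUid = new HashMap<Long, List<String>>();

    // Assumed consumer behavior: drop everything stored under the UID,
    // then add the documents carried by the new event.
    void consume(long uid, List<String> docs) {
        docsByUid.remove(uid);
        docsByUid.put(uid, new ArrayList<String>(docs));
    }

    // Returns the documents currently stored under the UID (empty if none).
    List<String> docsFor(long uid) {
        List<String> docs = docsByUid.get(uid);
        return docs == null ? new ArrayList<String>() : docs;
    }
}
```

In this model, a second event for the same UID that carries a single document leaves only that document behind; everything indexed by the first event is gone.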

The reason I'm asking is that I would like each DataEvent to be able to generate multiple Documents, and each Document to be able to indicate its own UID (I guess isDeleted() and isSkip() would also make sense per Document).

I guess what I want is just, at indexing time, to be able to explode 1 single event (a Kafka message in my case) into several Lucene documents, each with its own unique ID, that way optimizing my DataProvider load whenever I can.
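
A rough sketch of that "explode" step, under stated assumptions: the `uid=body;uid=body` payload format, `UidDoc` and `EventExploder` are all made up for illustration, and the body is a plain `String` rather than a Lucene `Document`.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical pairing of one document with its own UID; not a Zoie type.
final class UidDoc {
    final long uid;
    final String body;

    UidDoc(long uid, String body) {
        this.uid = uid;
        this.body = body;
    }
}

final class EventExploder {
    // Splits one event payload into several documents, each with its own UID.
    // Assumed payload format, for illustration only: "uid=body;uid=body;..."
    static List<UidDoc> explode(String message) {
        List<UidDoc> docs = new ArrayList<UidDoc>();
        for (String part : message.split(";")) {
            if (part.isEmpty()) {
                continue; // tolerate an empty payload or empty segments
            }
            int eq = part.indexOf('=');
            if (eq < 0) {
                throw new IllegalArgumentException("missing '=' in: " + part);
            }
            long uid = Long.parseLong(part.substring(0, eq).trim());
            docs.add(new UidDoc(uid, part.substring(eq + 1)));
        }
        return docs;
    }
}
```

For example, `EventExploder.explode("101=alpha;102=beta")` yields two `UidDoc` values, one per UID, from a single incoming message.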

I understand this is a whole new API, really different from the current ZoieIndexable.
In fact, it would probably mean moving the whole ZoieIndexable interface into IndexingReq at some point.

Is there a reason why this wouldn't make any sense at all? Does anyone foresee a future issue with this?
Could there be any interest for such a "feature"?

Regards,
Roman

bob xu

Oct 13, 2011, 10:27:06 AM
to zo...@googlegroups.com

"To me, that means that many documents can (and will, based on all DataConsumer implementation I've seen so far) map to a single UID.
Which means, following events with same UID will delete all of these documents from the index."


Maybe the reason is that one LinkedIn user's profile is a single document (we can call it a Zoie Doc) with one UID, and that Zoie Doc consists of several docs, the ones you call "multiple Documents", which all get indexed.
The "document picture" may look like this:
   [image.png]


Roman Garcia

Oct 13, 2011, 11:53:59 AM
to zo...@googlegroups.com
Wow! Thanks a lot Bob!
That does make sense... but still, wouldn't it make even more sense to be able to do both?

I mean:
1 event could produce 1 UID - 1 document
1 event could produce 1 UID - N documents
1 event could produce N UIDs - N documents

You could have that if every Document were allowed a UID of its own, something like:

IndexingReq {
   getUid
   getDocument
   getAnalyzer
   isDelete
   isSkip
}
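
Spelled out as Java, the idea might look like the sketch below. Everything here is an assumption for illustration: the `Proposed...` names are chosen so as not to suggest this is Zoie's actual `IndexingReq`, the document is a plain `String` instead of a Lucene `Document`, `getAnalyzer` is left out, and `apply` is a hypothetical consumer loop.

```java
import java.util.List;
import java.util.Map;

// Sketch of the proposed per-document request; NOT Zoie's actual IndexingReq.
interface ProposedIndexingReq {
    long getUid();
    String getDocument(); // stand-in for a Lucene Document; getAnalyzer omitted
    boolean isDelete();
    boolean isSkip();
}

// Minimal value implementation used only for the example.
final class SimpleReq implements ProposedIndexingReq {
    private final long uid;
    private final String document;
    private final boolean delete;
    private final boolean skip;

    SimpleReq(long uid, String document, boolean delete, boolean skip) {
        this.uid = uid;
        this.document = document;
        this.delete = delete;
        this.skip = skip;
    }

    public long getUid() { return uid; }
    public String getDocument() { return document; }
    public boolean isDelete() { return delete; }
    public boolean isSkip() { return skip; }
}

final class ProposedConsumer {
    // Hypothetical consumer loop: each request is handled under its own UID.
    static void apply(Map<Long, String> index, List<ProposedIndexingReq> reqs) {
        for (ProposedIndexingReq req : reqs) {
            if (req.isSkip()) {
                continue; // leave whatever is stored under this UID untouched
            }
            if (req.isDelete()) {
                index.remove(req.getUid()); // delete only this UID's document
            } else {
                index.put(req.getUid(), req.getDocument()); // add or replace
            }
        }
    }
}
```

With this shape, one event can yield requests for several UIDs, and a delete or skip on one of them does not touch the documents stored under the others.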

Makes sense?
Roman