IndexedDB and Full Text Search


Alec Flett

May 24, 2013, 2:29:50 PM
to stora...@chromium.org
I wrote up this document to help explain the challenges with Full Text search and IndexedDB, and explore the possibility of adding primitives to support this use case better than we do today. This includes some discussions that we've had with Mozilla.


I welcome further discussion here from people who have (or don't have!) more experience with this stuff. If something interesting comes out of this, we can take some of the more concrete ideas to WhatWG/W3C.

Alec

David Barrett-Kahn

May 28, 2013, 11:02:26 AM
to Alec Flett, stora...@chromium.org
Didn't mtrx have full text search based on IDB?

-Dave

David Grogan

May 28, 2013, 4:02:30 PM
to David Barrett-Kahn, Douglas Stockwell, Alec Flett, stora...@chromium.org
Doug, what do you remember?

Kyaw

Sep 26, 2013, 12:52:32 AM
to stora...@chromium.org
I have an initial implementation of full-text search in JavaScript: https://github.com/yathit/ydn-db-fulltext. It works pretty well: http://dev.yathit.com/demo/ydn-db-text/pubmed-search/index.html

I use a separate object store for the token index. Stemming, phonetic normalization, and free search ranking are implemented.
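
To illustrate the separate token store described above, here is a minimal sketch. It is not the ydn-db-fulltext implementation: plain Maps stand in for IndexedDB object stores, the "stemming" is a naive trailing-"s" strip, and the names tokenize, addDocument and search are made up for the example.

```javascript
// Hypothetical sketch: plain Maps stand in for IndexedDB object stores.
const docs = new Map();        // document store: id -> text
const tokenIndex = new Map();  // token store: token -> Set of document ids

// Naive normalization: lowercase, split on non-alphanumerics, and strip a
// trailing "s" as a stand-in for real stemming.
function tokenize(text) {
  return text
    .toLowerCase()
    .split(/[^a-z0-9]+/)
    .filter(Boolean)
    .map((t) => (t.length > 3 && t.endsWith("s") ? t.slice(0, -1) : t));
}

function addDocument(id, text) {
  docs.set(id, text);
  for (const token of tokenize(text)) {
    if (!tokenIndex.has(token)) tokenIndex.set(token, new Set());
    tokenIndex.get(token).add(id);
  }
}

// Rank documents by how many query tokens they contain.
function search(query) {
  const scores = new Map();
  for (const token of tokenize(query)) {
    for (const id of tokenIndex.get(token) || []) {
      scores.set(id, (scores.get(id) || 0) + 1);
    }
  }
  return [...scores.entries()]
    .sort((a, b) => b[1] - a[1])
    .map(([id]) => id);
}

addDocument("id1", "Dogs chase cats");
addDocument("id2", "Cats sleep all day");
console.log(search("cat dog")); // ["id1", "id2"]
```

In a real IndexedDB version, each token entry would be a record in a second object store, written in the same transaction as the document.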

Kyaw

Hoa V. Dinh

Sep 26, 2013, 1:04:40 AM
to Kyaw, stora...@chromium.org
It sounds interesting.
Could you add in your README.md some code snippet about how to use it?

for example:

fts.open('fts_db_name')
fts.add('id3', some_text)
fts.add('id2', some_text)
document_ids = fts.search('dog');

Though, I guess your APIs are async.

-- 
Hoa V. Dinh

Kyaw

Sep 26, 2013, 2:08:04 AM
to stora...@chromium.org, Kyaw
Hi Hoa!

I have updated the README file in the project repo: https://github.com/yathit/ydn-db-fulltext

BTW, is my demo app working for you? http://dev.yathit.com/demo/ydn-db-text/pubmed-search/index.html I have updated it and removed the appcache. It is just frustrating that it always works on my computer but not on others.

Best,
Kyaw 

Hoa V. Dinh

Sep 26, 2013, 2:19:01 AM
to Kyaw, stora...@chromium.org
Yes, it's working great.
It's probably missing an activity indicator when the indexing or search is in progress.

-- 
Hoa V. Dinh

Kyaw

Sep 26, 2013, 2:29:47 AM
to stora...@chromium.org, Kyaw
Yes, it is possible. The search method returns a promise which has a progress callback (addProgBack, or notify in the minified build). During index scanning, raw inverted indexes are dispatched by the library.

For a simple app it is not necessary; as you can see, it takes around 20 to 100 ms in the demo app.
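
As a rough illustration of a search that reports progress while it scans the index, the sketch below passes a callback alongside a promise. The names searchWithProgress and onProgress and the shape of the progress object are assumptions for this example only. They are not the ydn-db-fulltext API, which exposes progress through addProgBack on its own promise type.

```javascript
// Hypothetical sketch: `searchWithProgress` and `onProgress` are illustrative
// names, not the ydn-db-fulltext API.
const invertedIndex = new Map([
  ["dog", ["id1", "id3"]],
  ["cat", ["id1", "id2"]],
]);

async function searchWithProgress(tokens, onProgress) {
  const hits = new Set();
  for (let i = 0; i < tokens.length; i++) {
    // Yield control between steps, as an asynchronous index scan would.
    await Promise.resolve();
    const ids = invertedIndex.get(tokens[i]) || [];
    ids.forEach((id) => hits.add(id));
    // Report the raw index entry and how far the scan has come.
    onProgress({ token: tokens[i], ids, done: i + 1, total: tokens.length });
  }
  return [...hits];
}

searchWithProgress(["dog", "cat"], (p) =>
  console.log(`scanned ${p.done}/${p.total}: ${p.token}`)
).then((ids) => console.log("results:", ids));
```

An activity indicator could be driven from the callback and hidden once the promise resolves.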

Kyaw

Sep 26, 2013, 3:24:41 AM
to stora...@chromium.org
In my experience, the most useful feature for full-text search is not linguistics but managing the shadow object store.


The main problem is keeping the original document object store consistent with its inverted-index shadow store. Since IndexedDB does not have change events, a hook is required for all mutation requests. The proposed function-based indexing will solve this problem without requiring a shadow object store.
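
The hook described above can be sketched as a wrapper through which every mutation must pass. This is only an illustration under assumptions: Maps stand in for the document store and the shadow store, and the class name IndexedStore and its methods are hypothetical. With IndexedDB itself, both writes would go into a single transaction spanning the two object stores.

```javascript
// Hypothetical sketch: Maps stand in for the document store and its shadow
// (inverted index) store; in IndexedDB both writes would share one transaction.
class IndexedStore {
  constructor() {
    this.docs = new Map();    // id -> text
    this.shadow = new Map();  // token -> Set of ids
  }

  static tokens(text) {
    return new Set(text.toLowerCase().split(/\W+/).filter(Boolean));
  }

  // Every mutation goes through put/delete, so the shadow store cannot drift.
  put(id, text) {
    this.delete(id); // drop index entries left over from an older version
    this.docs.set(id, text);
    for (const t of IndexedStore.tokens(text)) {
      if (!this.shadow.has(t)) this.shadow.set(t, new Set());
      this.shadow.get(t).add(id);
    }
  }

  delete(id) {
    const old = this.docs.get(id);
    if (old === undefined) return;
    for (const t of IndexedStore.tokens(old)) {
      const ids = this.shadow.get(t);
      ids.delete(id);
      if (ids.size === 0) this.shadow.delete(t);
    }
    this.docs.delete(id);
  }

  lookup(token) {
    return [...(this.shadow.get(token.toLowerCase()) || [])];
  }
}

const store = new IndexedStore();
store.put("id1", "quick brown fox");
store.put("id1", "lazy dog");       // overwrite: old tokens are removed
console.log(store.lookup("fox"));   // []
console.log(store.lookup("dog"));   // ["id1"]
```

Any write that bypasses the wrapper would leave stale index entries, which is the consistency risk a built-in function-based index would remove.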

I also think that a full-text search API should not be too easy, otherwise the feature will be abused, taking up valuable space in the user's browser. Basically this means the IndexedDB API should provide only the capability to store indexes efficiently, and nothing beyond that.

There are many interesting use cases for this index key generator feature. For example, we can generate a composite key for 1) sorting on multiple fields with a mix of ascending and descending order, 2) a reverse index, and 3) a compound key with a multiEntry index. These are not currently possible. For these use cases multiEntry may be set to false and the function simply returns one generated key, whereas a full-text search index requires multiEntry set to true and emits multiple keys.
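
A small sketch of the two modes mentioned above, under stated assumptions: buildIndex, its multiEntry option and compareKeys are hypothetical helpers that work on an in-memory array, not part of the IndexedDB API. compareKeys is a simplified version of IndexedDB's array key ordering and only handles keys of the same type.

```javascript
// Hypothetical sketch of function-based indexing; `buildIndex` and its options
// are illustrative and not part of the IndexedDB API.

// Simplified key comparison: arrays are compared element by element.
function compareKeys(a, b) {
  if (Array.isArray(a) && Array.isArray(b)) {
    for (let i = 0; i < Math.min(a.length, b.length); i++) {
      const c = compareKeys(a[i], b[i]);
      if (c !== 0) return c;
    }
    return a.length - b.length;
  }
  return a < b ? -1 : a > b ? 1 : 0;
}

// keyFn returns one key, or an array of keys when multiEntry is true.
function buildIndex(records, keyFn, { multiEntry = false } = {}) {
  const entries = [];
  for (const record of records) {
    const keys = multiEntry ? keyFn(record) : [keyFn(record)];
    for (const key of keys) entries.push({ key, id: record.id });
  }
  return entries.sort((x, y) => compareKeys(x.key, y.key));
}

const people = [
  { id: 1, name: "Ann", age: 30, bio: "likes dogs" },
  { id: 2, name: "Ann", age: 41, bio: "likes cats" },
  { id: 3, name: "Bob", age: 25, bio: "dogs and cats" },
];

// Name ascending, age descending: negate the numeric field.
const byNameAgeDesc = buildIndex(people, (p) => [p.name, -p.age]);
console.log(byNameAgeDesc.map((e) => e.id)); // [2, 1, 3]

// Full-text style index: one entry per token.
const byToken = buildIndex(people, (p) => p.bio.split(" "), { multiEntry: true });
console.log(byToken.filter((e) => e.key === "dogs").map((e) => e.id)); // [1, 3]
```

Negating a field only works for numbers; a descending string field would need a different encoding in the generated key.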

Another use case is storing metadata as an index. For example, when caching a file in IndexedDB, we will also want to store its ETag so that we can invalidate the cache. An index key generator could be useful in this case. But since the generator function is not in a local closure, passing extra data to the function will not be straightforward.
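
The closure problem can be shown with a short sketch. It assumes, purely for illustration, that the engine would keep the generator as source text and re-create it later; the proposal may work differently. The names makeKeyFn and cacheVersion are hypothetical.

```javascript
// Assumption: the engine stores the generator as source text and re-creates it
// later, so variables captured from the enclosing scope are lost.
function makeKeyFn(cacheVersion) {
  return (record) => cacheVersion + ":" + record.etag; // captures cacheVersion
}

const original = makeKeyFn("v1");
const restored = new Function("return " + original.toString())();

console.log(original({ etag: "abc" })); // "v1:abc"

let failed = false;
try {
  restored({ etag: "abc" });
} catch (err) {
  failed = err instanceof ReferenceError; // cacheVersion is no longer in scope
}
console.log(failed); // true

// Workaround: keep the extra data on the record itself.
const selfContained = (record) => record.cacheVersion + ":" + record.etag;
const restoredOk = new Function("return " + selfContained.toString())();
console.log(restoredOk({ cacheVersion: "v1", etag: "abc" })); // "v1:abc"
```

Storing the extra value on each record costs some space, but the generator then depends only on its argument.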

Kyaw  

On Saturday, May 25, 2013 2:29:50 AM UTC+8, Alec Flett wrote: