Convert Lucene JAR file to WebAssembly


chite...@gmail.com

Mar 6, 2018, 3:38:17 AM
to TeaVM
Hi all,

I just discovered WebAssembly and find it very exciting. Is it possible to convert a JAR file to WebAssembly using TeaVM? One particular project I am interested in is Lucene (https://lucene.apache.org). They have a demo that builds a single JAR file (https://lucene.apache.org/core/2_9_4/demo.html).

If TeaVM can be used to make Lucene available in a web browser, it would be fantastic.

Thanks

Chi

Alexey Andreev

Mar 6, 2018, 3:45:24 AM
to te...@googlegroups.com

> I just discovered webassembly and found it very exciting. Is it
> possible to convert a JAR file to webassembly using TeaVM?
Hello! No, currently it's impossible. WebAssembly support is
experimental and does not yet cover all of the features of the JS
backend. Another option is converting Lucene to JavaScript, but this
requires some investigation. TeaVM does not support 100% of Java (for
example, it has limited support for reflection and does not support
class loaders). Please note that TeaVM does not work the way you expect
(i.e. by converting jar files). Instead, it's intended to convert
applications, i.e. a set of jar and class files with a single entry
point (the main method). You won't be able to call anything except the
main method from JavaScript unless you wrap all Java APIs in
JavaScript-friendly declarations. Note that the latter case is not
supported directly; you'll have to use some hacks to expose Java APIs
to JavaScript. The whole path is not easy, so if you are really going
to spend some time learning TeaVM and investigating whether it fits
your needs, let me know and I'll provide you with further instructions.
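
To illustrate the point about a single entry point, here is a minimal plain-Java sketch of an "application" in that sense: a main method from which all other code is reached. It uses no TeaVM-specific API. The class name SearchApp and the search helper are hypothetical stand-ins for illustration, and the substring match is only a placeholder for whatever Lucene code a real port would call.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class SearchApp {
    // Hypothetical stand-in for library code: a case-insensitive substring match.
    static List<String> search(List<String> docs, String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (String doc : docs) {
            if (doc.toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(doc);
            }
        }
        return hits;
    }

    // The single entry point; all other code is reached from here.
    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "TeaVM compiles Java bytecode",
                "Lucene is a search library");
        String term = args.length > 0 ? args[0] : "lucene";
        for (String hit : search(docs, term)) {
            System.out.println(hit);
        }
    }
}
```

Calling search directly from JavaScript would be the part that needs the wrapping described above.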

chite...@gmail.com

Mar 6, 2018, 4:34:56 AM
to TeaVM
Thanks for the prompt reply. I am looking for a text database solution for WebAssembly. Lucene/Solr is the standard. It would be great if some day Lucene/Solr could be implemented in WebAssembly. Then we would have a client-based search engine solution.

Keep up the good work!

Nathaniel Cohen

Jan 14, 2021, 4:55:58 AM
to TeaVM
I'm also very interested in porting Lucene to WebAssembly! Has anything changed in the last 3 years? If not, is there any plan to support this sort of project in the near future?

Also, I came across the JWebAssembly library and I was wondering if that could solve the issues you mention here.

Thanks

ScraM Team

Jan 15, 2021, 8:11:15 PM
to TeaVM
Chi,

Can you tell us more about your requirements? I'm curious what sort of system needs a client-side full-text index.

Do you need all of the stemming and other fancy search options? You might be able to get away with regex searches in your data model.
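
As a rough sketch of what "regex searches in your data model" could look like, the following plain-Java example scans an in-memory list of documents with java.util.regex. The class name RegexSearch and the sample documents are made up for illustration. An alternation in the pattern is a crude substitute for stemming, not an equivalent of Lucene's analyzers.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

public class RegexSearch {
    // Returns every document in which the regex matches somewhere, ignoring case.
    static List<String> search(List<String> docs, String regex) {
        Pattern pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
        List<String> hits = new ArrayList<>();
        for (String doc : docs) {
            if (pattern.matcher(doc).find()) {
                hits.add(doc);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
                "Indexing documents offline",
                "Search the index",
                "No match here");
        // The optional suffix group approximates matching "index", "indexes",
        // "indexing" and "indexed" as one term.
        for (String hit : search(docs, "\\bindex(es|ing|ed)?\\b")) {
            System.out.println(hit);
        }
    }
}
```

This is a linear scan with no ranking, so it only makes sense for small data sets.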

Nathaniel Cohen

Jan 18, 2021, 5:45:17 AM
to TeaVM
I use CouchDB as a backend database (CouchDB uses Lucene for full-text search). This DB lets users replicate their DB so the mobile app can work offline (all the queries can be made offline without access to the backend).

Since I have a webapp that directly calls CouchDB and a mobile app that does everything locally, I would like the results to be consistent between the webapp and the mobile app. There are other solutions for indexing in JavaScript, but they are not as feature-complete as Lucene.

Let me know if that makes sense.

ScraM Team

Jan 18, 2021, 8:25:36 PM
to TeaVM
That is intriguing.  I have lots of questions (but not too many answers).

Does CouchDB have pluggable search/indexing solutions?  Maybe there is a simpler-to-compile search plugin. 

Is this a single-user system?  How does replication avoid sending data from other users to the client side?

Is there a subset of Lucene that doesn't have lots of raw file access?  That's going to be one of the sticking points of building Lucene for the client, I would imagine.

Hopefully some of the above helps.  Please keep posting as you get further.

Nathaniel Cohen

Jan 19, 2021, 4:52:13 AM
to TeaVM
Actually, there are two different types of indexing. One is for querying (like SQL with limited features) and the other is Lucene (which doesn't come out of the box due to licensing restrictions). AFAIK, there isn't any other pluggable solution besides Lucene.

CouchDB is currently designed to work as one DB per user and can manage thousands or millions of DBs. There is advanced work in progress on "per document" access, which would allow using one DB for multiple users.

I'm not sure I understand your question regarding raw files. Is it about having access to the file system to save indexes? It seems that although the FileSystem API was discontinued by the W3C, it is supported by most browsers. I'm not sure if this is what you meant or if this API would be the solution.

I've also heard that sync vs. async could be challenging. Emscripten has ASYNCIFY. Does TeaVM have an equivalent? What about garbage collection or performance?
