Jep vs JPy

Ananth Gundabattula

Aug 26, 2017, 10:00:36 PM
to Jep Project
Hello All,

I was wondering whether there are distinct feature differences between Jep and JPy. I am looking at integrating Apache Apex with Python scoring logic, so that scikit-learn models and other pickled Python code can be invoked from a streaming engine such as Apache Apex. The streaming engine triggers the calls, and the responses are collected and emitted as tuples to the downstream logic. The Python interpreter embedded in the JVM scores each data point that arrives in the JVM from upstream logic.
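To make the scoring step concrete, here is a minimal Python sketch of the kind of module the embedded interpreter might load. The linear model, its parameter values, and the names `MODEL_BYTES` and `score` are assumptions for illustration, standing in for a real pickled scikit-learn model. The point is only that the model is unpickled once and then reused for every data point.

```python
import pickle

# Stand-in for a model file written by a training job (hypothetical values).
MODEL_BYTES = pickle.dumps({"weights": [0.5, -0.25, 2.0], "bias": 0.5})

# Unpickle once when the interpreter starts, not once per tuple.
_model = pickle.loads(MODEL_BYTES)


def score(features):
    """Return 1 if the linear score of one data point is positive, else 0."""
    total = _model["bias"] + sum(
        w * x for w, x in zip(_model["weights"], features)
    )
    return 1 if total > 0 else 0
```

The JVM side would then call `score` once per incoming tuple and emit the returned label downstream.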

In this regard, it looks like I have two options, since the framework would be optimising for low latency.

1. Jep - Embedded support, which should mean low-latency execution. Shared memory and NumPy integration are a great fit.
2. JPy - It also appears to claim NumPy support, but its architecture is not clear.

From the looks of it, both state that they support Java integration. What is not clear is which framework is the better fit for the low-latency integration use case described above.

Could someone from the community advise me on the strong points of each of these frameworks?

Regards,
Ananth

Nathan Jensen

Aug 29, 2017, 10:58:33 AM
to Jep Project
As a Jep developer I don't think I can provide an unbiased opinion, and I am not that familiar with jpy, though the two share some features and some of their independently developed code is quite similar (there are only so many ways to write some of that code). jpy is simpler than Jep in my opinion, which you could consider a disadvantage or an advantage.

With either project, if you're executing Python code inside the JVM, the Python GIL is going to be one of the biggest causes of slowdowns.  See https://github.com/ninia/jep/wiki/Jep-and-the-GIL

Glancing at the jpy code, it doesn't appear to be aggressively releasing the GIL like Jep does. I'm presuming you were going for multiple threads? If you're calling back and forth between Python and Java, then Jep will probably perform better due to it releasing the GIL on every Java call. If you're calling into Python once and letting it do a bunch of calculations, then the GIL is going to get in your way except where numpy is releasing it (which is quite often, thankfully).
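The second case can be illustrated with a small, standard-library-only Python sketch. The function names and job sizes are made up for illustration, and nothing here is specific to Jep or jpy. It runs the same pure-Python calculation sequentially and on a thread pool so the two timings can be compared.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def cpu_bound(n):
    # Pure-Python loop: under the GIL only one thread at a time can
    # execute this bytecode, however many threads are started.
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_sequential(jobs):
    # Run every job one after another on the calling thread.
    return [cpu_bound(n) for n in jobs]


def run_threaded(jobs, workers=4):
    # Run the same jobs on a thread pool; results keep the input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(cpu_bound, jobs))


def timed(fn, *args):
    # Return the function's result together with its wall-clock duration.
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start
```

Calling `timed(run_sequential, [200_000] * 4)` and `timed(run_threaded, [200_000] * 4)` on a standard CPython build will usually give similar durations, because the threads take turns holding the GIL. Work that releases the GIL, such as many numpy operations, is where threads can actually overlap.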

I hope that helps.
