JNI in Dataflow

Stepan Bujnak

Jan 29, 2018, 7:40:03 PM
to Google Cloud Developers

I need to use JNI in my Dataflow pipeline. The JNI code wraps a C++ library that has a ton of external dependencies on other system libraries. What is the best way to make sure those libraries are in place in the operating system when a worker runs the DoFn that uses the C++ library?

I found that DataflowPipelineOptions.setWorkerHarnessContainerImage might let me specify a custom Docker image from the Google Container Registry, on which I could install a bunch of libraries, but the documentation doesn't say much more. Are there any requirements for the Docker image in terms of installed packages, entry points, etc.?

George (Cloud Platform Support)

Jan 29, 2018, 10:11:09 PM
to Google Cloud Developers
Hello Stepan, 

You can use JNI with C++ libraries without DataflowPipelineOptions.setWorkerHarnessContainerImage: the --filesToStage option stages any files you need on the worker VMs, where your pipeline code can then use them. Every jar your code needs must be included by name in the list passed to --filesToStage.

You may refer to the "GoogleCloudPlatform/DataflowJavaSDK" repository on GitHub for details and the actual code.
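
For readers following the staging approach, here is a rough sketch of one common way to load a native library at runtime. It assumes the shared library is packaged as a resource inside one of the staged jars. The class name NativeLibLoader and the resource path are hypothetical and not part of the Dataflow SDK. The helper copies the resource to a temporary file and loads it with System.load. Note that this only covers the library itself: any system libraries it links against must still be resolvable by the dynamic linker on the worker, which is the concern raised in the original question.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of any SDK: extracts a shared library that
// was bundled in a jar on the classpath and loads it once per JVM.
final class NativeLibLoader {
    private static boolean loaded = false;

    private NativeLibLoader() {}

    // Copies the stream into a fresh temporary file and returns its path.
    static Path extractToTemp(InputStream in, String suffix) {
        try {
            Path tmp = Files.createTempFile("nativelib-", suffix);
            tmp.toFile().deleteOnExit();
            // createTempFile already created the file, so overwrite it.
            Files.copy(in, tmp, StandardCopyOption.REPLACE_EXISTING);
            return tmp;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Loads the library named by a classpath resource, for example the
    // assumed path "/native/libmylib.so". Later calls are no-ops.
    static synchronized void loadFromClasspath(String resourceName) {
        if (loaded) {
            return;
        }
        try (InputStream in = NativeLibLoader.class.getResourceAsStream(resourceName)) {
            if (in == null) {
                throw new IllegalArgumentException("Resource not found: " + resourceName);
            }
            Path lib = extractToTemp(in, ".so");
            // System.load requires an absolute path to the library file.
            System.load(lib.toAbsolutePath().toString());
            loaded = true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A call such as NativeLibLoader.loadFromClasspath("/native/libmylib.so") could be made once per worker JVM, for example from a static initializer or a DoFn's @Setup method, before the first native call.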

Stepan Bujnak

Jan 30, 2018, 12:19:27 AM
to Google Cloud Developers
Hi George,

I already considered staging the dependencies (libraries) using the mechanism you described. The problem is that the library I'm trying to use has lots of weird dependencies, and it would be much easier to just install them as I normally would (apt-get install ...) inside a Docker container. That's why I looked at the setWorkerHarnessContainerImage method, but I couldn't find more details.

George (Cloud Platform Support)

Jan 30, 2018, 10:12:40 PM
to Google Cloud Developers
If you install all dependencies, weird as they might be, the proposed solution should work. 

For coding and programming problems, you are better served by posting on forums such as Stack Overflow, where competent programmers are ready to help with advice. This group is meant more for general discussion: opinions, comments, and specific news.