How to run ScalaPB on Databricks / Spark?


Amir Ketata

Dec 27, 2019, 2:59:20 PM
to ScalaPB
Hi,

I'm more of a data engineer than a full-stack developer. I would like to convert Protobuf streams and files using Databricks. I read that ScalaPB can read Protobuf messages and convert them to JSON. How do I run ScalaPB on Databricks?

FYI, here is how to import libraries in Databricks: https://docs.databricks.com/libraries.html

Thank you very much in advance!

Nadav Samet

Dec 27, 2019, 3:21:55 PM
to Amir Ketata, ScalaPB
Hi Amir,

The easiest way is to use sbt to build a single "fat JAR" that contains everything you need, and then load it into Databricks.
Assuming you have sbt installed locally, the steps are roughly:


1. Create an sbt project with ScalaPB using:
sbt new scalapb/scalapb-template.g8

2. Add your protobufs into src/main/protobuf

3. Add scalapb-json4s to the project as a library dependency.

4. Add sbt-assembly to your project. See steps here
You may also need to shade Jackson if you run into NoSuchMethodError (method not found) errors at runtime.

5. Run "sbt assembly" to generate the fat jar.

6. Upload the generated JAR to Databricks.

7. Try loading your protos using the generated code. The JSON API docs are here: https://scalapb.github.io/json.html
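
As a rough, untested sketch of steps 3 to 5: the fragments below assume that the project generated by the template already sets up the ScalaPB code generator, and that sbt 1.1 or later is used (for the slash syntax). The version strings are placeholders to fill in from each project's release notes, and the `shaded.jackson` prefix is an arbitrary name chosen for illustration. The Scala version of the build also has to match the one used by the Databricks runtime on your cluster.

```scala
// project/plugins.sbt
// Adds the `assembly` task. "x.y.z" is a placeholder, not a real version.
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "x.y.z")

// build.sbt
// Step 3: JSON support for the generated ScalaPB classes.
libraryDependencies +=
  "com.thesamet.scalapb" %% "scalapb-json4s" % "x.y.z" // placeholder version

// Optional: relocate Jackson classes inside the fat JAR so they cannot
// clash with the Jackson version already on the cluster's classpath.
// The target prefix "shaded.jackson" is an arbitrary, illustrative choice.
assembly / assemblyShadeRules := Seq(
  ShadeRule.rename("com.fasterxml.jackson.**" -> "shaded.jackson.@1").inAll
)
```

Once the JAR is attached to a cluster, step 7 might look like the following in a Scala notebook. `Person` and the package `mypackage.person` are hypothetical stand-ins for whatever your own .proto files generate.

```scala
import scalapb.json4s.JsonFormat
import mypackage.person.Person // hypothetical generated message class

// Parse one serialized message and render it as a JSON string.
def protoToJson(bytes: Array[Byte]): String =
  JsonFormat.toJsonString(Person.parseFrom(bytes))
```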

Hope it helps,
Nadav


