How to use Jep in Spark in a parallel way, e.g. with mapPartitions?


Yingchao Ji

Aug 12, 2021, 10:41:29 PM
to Jep Project
Good day all,
I wonder whether anyone has experience using Jep in parallel in Spark. I am trying to use mapPartitions, but Spark seems to close the Jep instance before the code in each partition actually executes. Is there a good way to resolve this?

Example of code that does not work:

val df_new = df.mapPartitions(p => {
  // instantiate the interpreter before processing each partition
  val interp = new SharedInterpreter()
  val res = p.map(r => {
    interp.eval(...)
    interp.set(...)
    interp.exec(...)
  })
  // close the instance after processing each partition
  interp.close()
  res
})

It turns out Spark closes the Jep instance before execution on each partition.

Best regards,
Jason

Smokeriu

Dec 14, 2021, 12:52:19 AM
to Jep Project
This is due to Spark's (really Scala's) iterator semantics: `p.map(yourCode)` is lazy, so it only builds an iterator, and `interp.close()` runs before any row is actually processed. With mapPartitions you need to defer closing the interpreter until the iterator has been fully consumed.
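One way to do that (a sketch, not tested against this thread's setup: it assumes Spark's `TaskContext.addTaskCompletionListener` API, available since Spark 2.x, and Jep's `SharedInterpreter`; the Python snippet and the `row`/`result` variable names are placeholders) is to register a task-completion listener so the interpreter is closed only after the task, and hence the lazy iterator, has finished:

```scala
import jep.SharedInterpreter
import org.apache.spark.TaskContext

val df_new = df.mapPartitions { p =>
  // one interpreter per partition/task
  val interp = new SharedInterpreter()

  // Spark invokes this once the task completes, i.e. after the
  // iterator returned below has been fully consumed.
  TaskContext.get().addTaskCompletionListener[Unit] { _ =>
    interp.close()
  }

  p.map { r =>
    interp.set("row", r)                 // pass data into Python
    interp.exec("result = str(row)")     // placeholder Python code
    interp.getValue("result", classOf[String])
  }
}
```

Alternatively, you can eagerly materialize the partition before closing, e.g. `val res = p.map(...).toList; interp.close(); res.iterator`, at the cost of holding the whole partition's results in memory at once.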