How to change Python version for Pyspark Action Plugin

shreyasi misra

Sep 21, 2022, 1:43:20 AM
to CDAP User
I am running the CDAP sandbox locally on my system.
My system (Fedora 36) ships Python 3.10 by default, and the PySpark action plugin picks it up.
However, Python 3.10 removed the abstract base class aliases (such as MutableMapping and Sequence) from the collections package; they are now importable only from collections.abc.
When I try to run the default code in the PySpark plugin, i.e.:

from pyspark import *
from pyspark.sql import *
from cdap.pyspark import SparkExecutionContext

sec = SparkExecutionContext()
sc = SparkContext()

I get the error:
ImportError: cannot import name 'MutableMapping' from 'collections' (/usr/lib64/python3.10/collections/__init__.py)
(I have attached the log from this error below)

If I try to correct that by importing packages as a workaround, I get stuck at:
ImportError: cannot import name 'Sequence' from 'collections' (/usr/lib64/python3.10/collections/__init__.py)
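
For context, both errors come from older library code that still does `from collections import MutableMapping` (or `Sequence`). A minimal sketch of the kind of import workaround mentioned above follows. It re-exposes the missing names on the `collections` module before the old code is imported. The helper name `restore_legacy_collections_aliases` and the list of names are illustrative assumptions, and nothing in this thread confirms that such a patch can run early enough inside the plugin, so treat it as a diagnostic aid rather than a fix.

```python
import collections
import collections.abc

# Assumption: these are the names the old library code imports from
# `collections` directly. Extend the tuple if other names fail.
_LEGACY_ABC_NAMES = ("MutableMapping", "Mapping", "Sequence", "Iterable", "Callable")


def restore_legacy_collections_aliases():
    """Re-expose selected ABCs on `collections` where they are missing.

    Returns the list of names that were actually added.
    """
    restored = []
    for name in _LEGACY_ABC_NAMES:
        if not hasattr(collections, name):
            setattr(collections, name, getattr(collections.abc, name))
            restored.append(name)
    return restored


# Must run before any module that uses the old import style is loaded.
restore_legacy_collections_aliases()

# The old-style import now resolves on Python 3.10+ as well.
from collections import MutableMapping, Sequence
```

Each failing name has to be added separately, which matches the experience of fixing one ImportError only to hit the next one.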

I have installed Python 3.9 and want CDAP to use it instead of the system default.

How can I point CDAP at the Python 3.9 interpreter?
Thank you
default-67e4bbb7-396f-11ed-953a-3af35b0f370b.log

Christophe DIEDHIOU

Sep 21, 2022, 2:55:52 AM
to cdap...@googlegroups.com
Hi, how are you?
I think the Python plugin in CDAP does not support Python 3.x. Check which Python version the latest release of the plugin supports and adapt your script accordingly.

shreyasi misra

Sep 21, 2022, 3:31:04 AM
to CDAP User
Hi, I am good, thanks. Hope you are well too.

This is the source code for the plugin I deployed from the Hub

Could you direct me a little on how to check the python version and make the necessary changes to make my code work?
Thanks

Christophe DIEDHIOU

Sep 21, 2022, 3:39:29 AM
to cdap...@googlegroups.com
Check whether the CDAP documentation covers this.

shreyasi misra

Sep 21, 2022, 4:25:02 AM
to CDAP User
I did check there but couldn't find anything related to this. Any other leads?

Christophe DIEDHIOU

Sep 21, 2022, 5:48:18 AM
to cdap...@googlegroups.com
Run this to see which plugin JAR you have:
ls cdap-home/data/namespaces/the_name_of_namespace/artifacts/python-transform/

shreyasi misra

Sep 26, 2022, 4:57:15 AM
to CDAP User
Hi, thanks for the help.
However, I had to take a different route, so I am posting it here in case someone else gets stuck:
- Install the required Python version on your system
- Install a virtual environment tool such as virtualenv
- Create a virtual environment that uses that Python version
- Install the Python packages required by the PySpark action plugin into it
- Activate the virtual environment, then start the CDAP sandbox

Now the CDAP processes use the Python interpreter and packages provided by the virtual environment.
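
The steps above can be sketched as a small Python helper. This is only an illustration under stated assumptions: the environment directory, the package list, and the function names are hypothetical, the `virtualenv` and `cdap` commands are assumed to be on PATH, and the thread does not say which packages the plugin needs. "Activating" the environment is modelled here by putting its `bin` directory first on PATH for the child process.

```python
import os
import subprocess


def build_setup_commands(python_exe, env_dir, packages):
    """Return the commands that create the environment and install packages."""
    commands = [["virtualenv", "--python", python_exe, env_dir]]
    if packages:
        pip = os.path.join(env_dir, "bin", "pip")
        commands.append([pip, "install", *packages])
    return commands


def activated_environment(env_dir, base_env=None):
    """Return a copy of the environment with the virtualenv's bin dir first on PATH."""
    env = dict(os.environ if base_env is None else base_env)
    env["VIRTUAL_ENV"] = env_dir
    env["PATH"] = os.path.join(env_dir, "bin") + os.pathsep + env.get("PATH", "")
    # A stray PYTHONHOME would override the interpreter's own prefix.
    env.pop("PYTHONHOME", None)
    return env


def run_all(python_exe, env_dir, packages):
    """Create the environment, install packages, then start the sandbox in it."""
    for command in build_setup_commands(python_exe, env_dir, packages):
        subprocess.run(command, check=True)
    # Assumption: `cdap` refers to the sandbox's bin/cdap script.
    subprocess.run(["cdap", "sandbox", "start"],
                   env=activated_environment(env_dir), check=True)
```

Calling `run_all("python3.9", "/path/to/env", [...])` with your own directory and package list would perform the same sequence as activating the environment in a shell and starting the sandbox from there.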