PyPI install failure on new Databricks runtime


Malcolm Charles

Nov 7, 2019, 10:29:31 AM
to koalas-dev
Hello again,

I apologize if this isn't the appropriate place to ask, as I know this isn't strictly related to Koalas functionality. However, my team operates two Databricks clusters.

On DBR 5.3, the following command installs Koalas without issue: dbutils.library.installPyPI('koalas')

However, on DBR 5.5 LTS, the same command fails with a very unhelpful exit code 1:

org.apache.spark.SparkException: Process List(/local_disk0/pythonVirtualEnvDirs/virtualEnv-26c996c4-457b-4452-b62c-aae040a9843f/bin/python, /local_disk0/pythonVirtualEnvDirs/virtualEnv-26c996c4-457b-4452-b62c-aae040a9843f/bin/pip, install, koalas, --disable-pip-version-check) exited with code 1.
Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-VDq08I/koalas/

I'm a little out of my depth here and wondering whether there are any dependencies I need to fix or whether I need to stick with DBR 5.3 for now.

Thank you,
Malcolm

Xiao Li

Nov 7, 2019, 12:50:19 PM
to Malcolm Charles, koalas-dev
Hi, Malcolm, 

Thanks for using Koalas!  Have you tried to install Koalas using the Libraries tab on the cluster UI?

Thanks,

Xiao

--
You received this message because you are subscribed to the Google Groups "koalas-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to koalas-dev+...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/koalas-dev/7c116ec1-f244-45a1-b0d2-13fbc3f58a2f%40googlegroups.com.


--
Databricks Summit - Watch the talks 

Malcolm Charles

Nov 7, 2019, 12:52:31 PM
to Xiao Li, koalas-dev
When I first started using Koalas, I did this, but it had unintended side effects on how other libraries behaved. I recall reading a Databricks blog or forum post that said it was better practice to load libraries within a notebook rather than through a cluster install. Has this changed?

Malcolm Charles

Nov 7, 2019, 3:53:42 PM
to koalas-dev
I attempted to install Koalas using the cluster UI. Similar to before, the install succeeded on the cluster using DBR 5.3, but failed on the cluster using DBR 5.5 LTS.

The error I received in the cluster UI using DBR 5.5 LTS is:

java.lang.RuntimeException: ManagedLibraryInstallFailed: org.apache.spark.SparkException: Process List(/databricks/python/bin/pip, install, koalas, --disable-pip-version-check) exited with code 1.
DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7.
ERROR: Complete output from command python setup.py egg_info:
ERROR: Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/tmp/pip-install-olcuTO/koalas/setup.py", line 43
    file=sys.stderr)
        ^
SyntaxError: invalid syntax
----------------------------------------
ERROR: Command "python setup.py egg_info" failed with error code 1 in /tmp/pip-install-olcuTO/koalas/
for library:PythonPyPiPkgId(koalas,None,None,List()),isSharedLibrary=false



Takuya Ueshin

Nov 7, 2019, 4:52:23 PM
to Malcolm Charles, koalas-dev
Hi Malcolm,

Thanks for reporting the issue and sharing the error message.

From the error message, it looks like the cluster is running Python 2, but Koalas only supports Python >= 3.5.
Could you confirm and try with Python 3?
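
One way to confirm which interpreter a cluster is running is to print the version from a notebook cell. The following is a minimal sketch that uses only the standard library; `is_supported` is a hypothetical helper (not part of Koalas or Databricks), and the (3, 5) floor is taken from the requirement stated above.

```python
import sys

# Minimum version taken from the message above (Koalas supports Python >= 3.5).
MIN_VERSION = (3, 5)


def is_supported(version_info=None, minimum=MIN_VERSION):
    """Return True if the (major, minor) of version_info meets the minimum.

    Hypothetical helper for illustration; not a Koalas or Databricks API.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= tuple(minimum)


# %-formatting keeps these lines valid on both Python 2 and Python 3.
print("Python %d.%d.%d" % tuple(sys.version_info[:3]))
print("Meets Koalas requirement: %s" % is_supported())
```

If the first line reports a 2.x version, the cluster would need to be recreated with Python 3 before Koalas can be installed.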

Thanks.



Malcolm Charles

Nov 7, 2019, 5:18:15 PM
to Takuya Ueshin, koalas-dev
Aha, thanks for pointing that out. It looks like we made a mistake in the cluster initialization. I don't think we've ever had the cluster on Python 2 intentionally, so I didn't think to check that. Sorry if I've needlessly taken up any time. Thank you for the help!

Reynold Xin

Nov 7, 2019, 5:19:12 PM
to Malcolm Charles, koalas-dev, Takuya Ueshin
Thanks, Malcolm. We will look into whether we can make the error message better.
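
For illustration only, one way a setup script could report an unsupported interpreter readably is to keep its version guard parseable by Python 2. A call such as `print(..., file=sys.stderr)` is a syntax error under Python 2 (unless `print_function` is imported from `__future__`), so the whole file is rejected before any check can run. The sketch below is an assumption about what such a guard might look like, not the actual Koalas `setup.py`; `version_error` is a hypothetical helper.

```python
import sys

MIN_VERSION = (3, 5)  # from the thread: Koalas supports Python >= 3.5


def version_error(version_info, minimum=MIN_VERSION):
    """Return an error string if version_info is too old, else None.

    Hypothetical helper; uses only syntax valid in both Python 2 and 3.
    """
    if tuple(version_info[:2]) >= tuple(minimum):
        return None
    return "Koalas requires Python %d.%d or newer, but found Python %d.%d.\n" % (
        minimum[0], minimum[1], version_info[0], version_info[1])


message = version_error(sys.version_info)
if message is not None:
    # sys.stderr.write parses under Python 2, unlike print(..., file=...).
    sys.stderr.write(message)
    sys.exit(1)
```

setuptools also accepts a `python_requires=">=3.5"` argument to `setup()`, which lets recent pip versions reject an unsupported interpreter before the setup script runs at all.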


