Hi,
We need some help. We want to use the Spark Thrift Server so that team members can connect via JDBC and run SQL such as CREATE/INSERT (any DDL/DML).
We have Hive on Kubernetes as a separate deployment, and we have our own Spark image.
When we create spark-thrift as a Deployment and start the Spark Thrift services (LDAP integrated), the SQL that users submit does not spin up any driver/executor pods, so when a lot of queries come in, the Thrift services fail with connection errors. I understand this is happening because everything is running locally within that pod (that's my assumption).
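For context, this is roughly what I believe the Thrift Server launch inside the pod would have to look like for executors to come up as separate pods (a sketch only; the namespace, service account, headless-service host, driver port, and image name below are placeholders, not our real values):

  /opt/spark/sbin/start-thriftserver.sh \
    --master k8s://https://kubernetes.default.svc \
    --conf spark.kubernetes.namespace=spark \
    --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
    --conf spark.kubernetes.container.image=<our-spark-image> \
    --conf spark.driver.host=spark-thrift-headless.spark.svc.cluster.local \
    --conf spark.driver.port=7078 \
    --conf spark.executor.instances=2

Without the k8s:// master and the driver host/port settings, everything stays inside the Thrift Server pod, which I assume is what we are seeing.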
I was looking online and saw your custom operator, among others.
If I download and install your operator, can I do the following:
1. applicationFileUrl: can I point this to a local path inside the image (I will download the application file and put it into the image)?
2. Can I use sparkConf with, for example, the following settings:
"spark.eventLog.dir": "gs://dev_spark_event_logs/"
"spark.eventLog.enabled": "true"
"spark.eventLog.logStageExecutorMetrics": "true"
"spark.jar.ivy": "/tmp"
"spark.hadoop.fs.gs.impl": "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem"
"spark.kubernetes.executor.volumes.persistentVolumeClaim.data.mount.path": /var/log/app_logs
"spark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.claimName": OnDemand
"spark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.storageClass": fast
"spark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.sizeLimit": 2Gi
Note: The intent with these options is to use dynamically provisioned volumes rather than static ones (as your documentation describes); see the sketch right after this note.
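For reference, here are the same dynamic-PVC settings written as plain --conf flags (again just a sketch; the volume name "data", the mount path, and the "fast" storage class are carried over from the snippet above). My understanding is that with claimName=OnDemand, Spark 3.1+ provisions a new PVC for each executor rather than mounting a single pre-created claim:

  --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.mount.path=/var/log/app_logs \
  --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.claimName=OnDemand \
  --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.storageClass=fast \
  --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.sizeLimit=2Gi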
Our Spark version is 3.1.2. Will your operator work with that?
I am happy to set up a call if you can share your email address and a time (with timezone) when you are available.
Appreciate your help!