Hi,
We need some help. We want to use the Spark Thrift Server so that team members can connect via JDBC and run SQL such as CREATE/INSERT (any DDL/DML).
We have Hive on Kubernetes as a separate deployment, and we have our own Spark image.
When we create spark-thrift as a Deployment and start the Spark Thrift services (LDAP integrated), the SQL that users submit does not spin up any driver/executor pods, so when a lot of queries come in, the Thrift services fail with connection errors. I understand this is happening because everything is running locally within that pod (that's my assumption).
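For context, this is roughly what I believe the Thrift Server launch inside the pod would have to look like for executors to come up as separate pods (a sketch only; the namespace, service account, headless-service host, driver port, and image name below are placeholders, not our real values):

  /opt/spark/sbin/start-thriftserver.sh \
    --master k8s://https://kubernetes.default.svc \
    --conf spark.kubernetes.namespace=spark \
    --conf spark.kubernetes.authenticate.driver.serviceAccountName=spark \
    --conf spark.kubernetes.container.image=<our-spark-image> \
    --conf spark.driver.host=spark-thrift-headless.spark.svc.cluster.local \
    --conf spark.driver.port=7078 \
    --conf spark.executor.instances=2

Without the k8s:// master and the driver host/port settings, everything stays inside the Thrift Server pod, which I assume is what we are seeing.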
I was looking online and saw your custom operator, among others.
If I download and install your operator, can I do the following:
1. applicationFileUrl: can I point this to a local path inside the image (I will download the application file and put it into the image)?
2. Can I use sparkConf with, for example, the following settings:
"spark.eventLog.dir": "gs://dev_spark_event_logs/"
"spark.eventLog.enabled": "true"
"spark.eventLog.logStageExecutorMetrics": "true"
"spark.jar.ivy": "/tmp"
"spark.hadoop.fs.gs.impl": "com.google.cloud.hadoop.fs.gcs.GoogleHadoopFileSystem"
"spark.kubernetes.executor.volumes.persistentVolumeClaim.data.mount.path": /var/log/app_logs
"spark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.claimName": OnDemand
"spark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.storageClass": fast
"spark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.sizeLimit": 2Gi
Note: The intent with these options is to use dynamically provisioned volumes rather than static ones (as your documentation describes); see the sketch right after this note.
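For reference, here are the same dynamic-PVC settings written as plain --conf flags (again just a sketch; the volume name "data", the mount path, and the "fast" storage class are carried over from the snippet above). My understanding is that with claimName=OnDemand, Spark 3.1+ provisions a new PVC for each executor rather than mounting a single pre-created claim:

  --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.mount.path=/var/log/app_logs \
  --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.claimName=OnDemand \
  --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.storageClass=fast \
  --conf spark.kubernetes.executor.volumes.persistentVolumeClaim.data.options.sizeLimit=2Gi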
Our Spark version is 3.1.2. Will your operator work with that?
I am happy to set up a call if you can share your email address and a time (with timezone) when you are available.
Appreciate your help!