I'm trying to trigger a cloud function to create a new dataproc cluster (not a workflow template) in the event of pub/sub topic which gets message whenever any dataproc cluster creation gets failed with status code of numeric digit apart from zero(status_code!=0) in cloud logging (dataproc activity log).
I have written a python code for the above scenario but this cloud function crashes immediately after triggering by pub/sub.
Would you please check what is wrong in this code and what needs to be modified based on the above scenario to execute successfully.
```
import base64
import json
import googleapiclient.discovery
from google.cloud import dataproc_v1 as dataproc
def dataproc_workflow(event, context):
"""
Triggered by a Cloud Pub/Sub message containing a Dataproc status code != 0
audit activity Stackdriver log message
"""
pubsub_message = base64.b64decode(event['data']).decode('utf-8')
msg_json = json.loads(pubsub_message)
proto_payload = msg_json['protoPayload']
resource_name = proto_payload['resourceName']
email = proto_payload['authenticationInfo']['principalEmail']
client = dataproc.ClusterControllerClient()
create_cluster(client, project_id, zone, region, cluster_name)
print(f"Cluster created: {cluster_name}.")
def create_cluster(client, project_id, zone, region, cluster_name):
print('Creating cluster...')
cluster_data = {
'project_id': *********************,
'cluster_name': simple,
'region' : ******************,
'config': {
'config_bucket': ************,
'gce_cluster_config': {
'zone_uri': ********************,
'subnetwork_uri': ********************,
'internal_ip_only': true,
'service_account_scopes': [
'https://www.googleapis.com/auth/cloud-platform'
],
'tags': [
'dataproc-rule',
]
},
'master_config': {
'num_instances': 1,
'machine_type_uri': 'n1-standard-1'
},
'worker_config': {
'num_instances': 2,
'machine_type_uri': 'n1-standard-1'
}
}
}
response = client.create_cluster(project_id, region, cluster_data)
result = response.result()
print("After cluster create")
return result
Also advise , is this the ideal solution for achieving high availability of dataproc cluster over zonal failure ? . If any best practise , please advise.
Thanks,
Deena