Unable to launch DataProc Cluster


Harish Vij

May 5, 2024, 11:42:01 AM
to Google Cloud Dataproc Discussions
Hi Team, I am unable to create a Dataproc cluster from my free tier account.
I am getting the following error:
{
  "protoPayload": {
    "@type": "type.googleapis.com/google.cloud.audit.AuditLog",
    "status": {
      "code": 4,
      "message": "Multiple Errors:\n - Timeout waiting for instance [master instance] to report in.\n - Timeout waiting for instance [worker instance 0] to report in.\n - Timeout waiting for instance [worker instance 1] to report in."
    },
    "authenticationInfo": {
      "principalEmail": "[REDACTED]"
    },
    "serviceName": "dataproc.googleapis.com",
    "methodName": "google.cloud.dataproc.v1.ClusterController.CreateCluster",
    "resourceName": "projects/[PROJECT_ID]/regions/[REGION]/clusters/[CLUSTER_NAME]"
  },
  "insertId": "[REDACTED]",
  "resource": {
    "type": "cloud_dataproc_cluster",
    "labels": {
      "project_id": "[REDACTED]",
      "cluster_name": "[REDACTED]",
      "region": "[REDACTED]",
      "cluster_uuid": "[REDACTED]"
    }
  },
  "timestamp": "[REDACTED]",
  "severity": "ERROR",
  "logName": "projects/[PROJECT_ID]/logs/cloudaudit.googleapis.com%2Factivity",
  "operation": {
    "id": "[REDACTED]",
    "producer": "dataproc.googleapis.com",
    "last": true
  },
  "receiveTimestamp": "[REDACTED]"
}

Is there any change in policy regarding creation of Dataproc clusters for the free tier? I am trying to create 1 master and 2 worker nodes, with the N1 series, n2-standard-4 machine type and 64 GB of disk space.

P.S.: Ignore the redacted parts; they are just there to hide personal info.

Richard Holowczak

May 6, 2024, 9:50:45 AM
to Google Cloud Dataproc Discussions
N1 machine types might not be available in your region. 
You can also check your Quotas and Limits to see if you are restricted for any resources.
You should also make sure your service account has the dataproc.worker role.
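For example (a rough sketch only -- the region and the default Compute Engine service account here are assumptions, so substitute your own project number and region):

gcloud compute regions describe us-east1

gcloud projects add-iam-policy-binding PROJECT_ID --member="serviceAccount:PROJECT_NUMBER-compute@developer.gserviceaccount.com" --role="roles/dataproc.worker"

The first command lists the regional CPU, disk, and IP address quotas; the second grants roles/dataproc.worker to the default Compute Engine service account.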

My suggestion is to start with a basic single-node cluster.
Try using an inexpensive e2-standard-4 machine type.
You only need 100GB of disk at first.
Choose a reliable region (in the US if possible) where you have high confidence these resources will be available.

If you can get a basic single-node cluster running, then delete it and try again with a 3-node cluster.
Again, start with smaller machine types and disks.
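For reference, a minimal single-node create command along those lines might look like this (the cluster name, region, and project are placeholders, not a tested configuration):

gcloud dataproc clusters create test-cluster --region us-central1 --single-node --master-machine-type e2-standard-4 --master-boot-disk-size 100 --project PROJECT_ID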


Cheers

Rich H.

Harish Vij

May 8, 2024, 2:08:07 PM
to Google Cloud Dataproc Discussions
Hi Richard,
Thanks for the help. As you suggested, I created a single-node Spark cluster with an e2 machine, but I was getting the same error.

{
  "protoPayload": {
    "@type": "type.googleapis.com/google.cloud.audit.AuditLog",
    "status": {
      "code": 4,
      "message": "Timeout waiting for instance to report in."

    },
    "authenticationInfo": {
      "principalEmail": "[REDACTED]",
      "principalSubject": "[REDACTED]"
    },
    "requestMetadata": {
      "requestAttributes": {},
      "destinationAttributes": {}
    },
    "serviceName": "[REDACTED]",
    "methodName": "[REDACTED]",
    "resourceName": "[REDACTED]"

  },
  "insertId": "[REDACTED]",
  "resource": {
    "type": "cloud_dataproc_cluster",
    "labels": {
      "cluster_uuid": "[REDACTED]",
      "project_id": "[REDACTED]",
      "region": "[REDACTED]",
      "cluster_name": "[REDACTED]"

    }
  },
  "timestamp": "[REDACTED]",
  "severity": "ERROR",
  "logName": "[REDACTED]",
  "operation": {
    "id": "[REDACTED]",
    "producer": "[REDACTED]",

    "last": true
  },
  "receiveTimestamp": "[REDACTED]"
}

"Timeout waiting for instance to report in." The interesting thing is that when I look in VM instances, it was able to create the virtual machines, but they somehow fail to report back to the cluster. The same thing happened when I was creating a multi-node cluster. Is there any setting required to solve this reporting issue? I believe the VMs are being allocated, since my credits are being used, but somehow they are unable to connect to my cluster.

Richard Holowczak

May 8, 2024, 2:13:02 PM
to Google Cloud Dataproc Discussions
It might help if you post your equivalent gcloud command line (with your Project ID redacted, of course).
This way folks can see all of your configuration options.

Harish Vij

May 8, 2024, 2:48:25 PM
to Google Cloud Dataproc Discussions
I am actually using the console to create the cluster. How can I get the equivalent gcloud command?

Richard Holowczak

May 8, 2024, 3:51:47 PM
to Google Cloud Dataproc Discussions
After you have configured everything using the console (but before you click the CREATE button), 
there will be a button below the create button labeled "equivalent command line".
Click on that button to view and copy the command line.
Paste that into this forum (and obscure your Project ID, etc.).
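As an aside, if a previous failed cluster is still listed in the console (even in an ERROR state), you should also be able to dump its full configuration with something like:

gcloud dataproc clusters describe CLUSTER_NAME --region REGION

(the cluster name and region are placeholders). That output is another way to share your configuration here.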



Cheers

R.

Harish Vij

May 9, 2024, 3:06:54 PM
to Google Cloud Dataproc Discussions
gcloud dataproc clusters create hadoop-cluster --enable-component-gateway --region us-east1 --subnet default --no-address --single-node --master-machine-type e2-standard-2 --master-boot-disk-type pd-balanced --master-boot-disk-size 128 --image-version 2.2-debian12 --optional-components JUPYTER --project [PROJECT_NAME]


This is the equivalent command. I am still getting the same error today.

Richard Holowczak

May 12, 2024, 9:21:52 AM
to Google Cloud Dataproc Discussions
It looks as if you are leaving the "Configure all instances to have only internal IP addresses" option checked (that is the --no-address flag in your command). It should probably be unchecked; that default changed recently (about 2 months ago).
Here is an example that works for me with minimal resource requirements:

gcloud dataproc clusters create cluster-1234 --enable-component-gateway --region us-central1 --single-node --master-machine-type e2-standard-4 --master-boot-disk-type pd-balanced --master-boot-disk-size 100 --image-version 2.2-debian12 --optional-components JUPYTER --max-idle 7200s --project [PROJECT_ID]
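
If you do want to keep the internal-IP-only setting (the --no-address flag), the cluster VMs generally need Private Google Access enabled on the subnet so they can reach Google APIs and report in. A rough sketch, assuming the default subnet in us-east1 from your earlier command:

gcloud compute networks subnets update default --region us-east1 --enable-private-ip-google-access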
