Google Cloud Dataproc updates - 2015/10/15


Hadoop on Google Cloud Platform Team

Oct 16, 2015, 1:38:58 AM
to Hadoop on Google Cloud Platform Team, gcp-hadoo...@googlegroups.com

Hello everyone,


We recently launched Google Cloud Dataproc in beta as an easy-to-use, fast, and cost-effective managed Spark and Hadoop service. Because Cloud Dataproc is a Spark/Hadoop-related product on Google Cloud Platform, we will post to this list whenever we update the Cloud Dataproc service or the Spark/Hadoop components deployed on Dataproc clusters.


Today we released an update to the Cloud Dataproc service with a number of fixes, enhancements, and optimizations. Here are the release notes for this 10/15/2015 release.


Bugfixes

  • Fixed a bug where DataNodes failed to register with the NameNode on startup, resulting in less HDFS capacity than expected.

  • Jobs can no longer be submitted to clusters in an Error state.

  • Clusters that failed to delete cleanly in some cases now delete properly upon request.

  • Reduced HTTP 500 errors when deploying Cloud Dataproc clusters.

  • Corrected distcp out-of-memory errors with better cluster configuration.

  • Fixed an issue where jobs failed to delete properly and became stuck in a Deleting state.


Core service improvements

  • HTTP 500 errors with more detail about the error are shown instead of 4xx errors.

  • "Resource already exists" errors now show more detail about which resources already exist.

  • Errors related to Google Cloud Storage display specific information instead of a generic error message.

  • Listing operations support pagination.
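
For readers new to paginated list calls, here is a minimal Python sketch of how a client typically walks through pages. It does not call the Cloud Dataproc API: `make_fake_lister` is a hypothetical in-memory stand-in, and the token-in, token-out shape is an assumption based on the common page-token convention rather than something stated in these release notes.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# A page is a list of item names plus an optional token for the next page.
Page = Tuple[List[str], Optional[str]]


def make_fake_lister(items: List[str], page_size: int) -> Callable[[Optional[str]], Page]:
    """Return a hypothetical paginated list call backed by an in-memory list."""

    def list_page(page_token: Optional[str] = None) -> Page:
        # The token here is simply the start offset encoded as a string.
        start = int(page_token) if page_token else 0
        end = start + page_size
        next_token = str(end) if end < len(items) else None
        return items[start:end], next_token

    return list_page


def iterate_all(list_page: Callable[[Optional[str]], Page]) -> Iterator[str]:
    """Yield every item by following next-page tokens until none is returned."""
    token: Optional[str] = None
    while True:
        page, token = list_page(token)
        yield from page
        if token is None:
            break


clusters = [f"cluster-{i}" for i in range(5)]
all_names = list(iterate_all(make_fake_lister(clusters, page_size=2)))
print(all_names)
```

The loop requests one page at a time and stops when no further token comes back, so a caller never has to hold more than one page of results in a single response.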


Optimizations

  • Significantly improved YARN utilization for MapReduce jobs running directly against Cloud Storage.

  • Adjustments to yarn.scheduler.capacity.maximum-am-resource-percent enable better utilization and concurrent job support.
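
To give a rough sense of why yarn.scheduler.capacity.maximum-am-resource-percent affects concurrency, the Python sketch below estimates how many YARN ApplicationMasters can run at once for a given setting. The cluster size, AM container size, and percentages are illustrative assumptions, not the values Cloud Dataproc uses, and the calculation is a simplification that ignores per-queue capacities, per-user limits, and container size rounding.

```python
import math


def max_concurrent_app_masters(
    cluster_memory_mb: int,
    am_resource_percent: float,
    am_container_mb: int,
) -> int:
    """Estimate how many ApplicationMasters fit within the AM resource budget.

    Simplified model: the scheduler reserves at most
    cluster_memory_mb * am_resource_percent for ApplicationMasters, and each
    running application needs one AM container of am_container_mb.
    """
    if not 0.0 < am_resource_percent <= 1.0:
        raise ValueError("am_resource_percent must be in (0, 1]")
    am_budget_mb = cluster_memory_mb * am_resource_percent
    return math.floor(am_budget_mb / am_container_mb)


# Illustrative numbers only: 48 GB of YARN memory and 1 GB AM containers.
low = max_concurrent_app_masters(49152, 0.1, 1024)
high = max_concurrent_app_masters(49152, 0.5, 1024)
print(low, high)
```

Under this model, raising the percentage lets more applications hold an ApplicationMaster at the same time, at the cost of leaving less memory for the task containers those applications launch.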

The Cloud Dataproc release notes will serve as a consolidated list of all release notes from our beta launch forward. You can learn more about Cloud Dataproc on the Google Cloud Platform site.


Best,


Google Cloud Dataproc / Google Cloud Spark & Hadoop Team

