Google Cloud Dataproc - January 21st release


James Malone

Jan 25, 2016, 1:17:19 PM
to gcp-hadoo...@googlegroups.com

Hello everyone,


Late last week we released a new set of updates to Google Cloud Dataproc.


New features

  • The dataproc command in the Google Cloud SDK now includes a --properties option for adding or updating properties in some cluster configuration files, such as core-site.xml. Properties are mapped to configuration files by specifying a prefix, such as "core:io.serializations". The following are supported prefixes and their mappings:

    • core - core-site.xml
    • hdfs - hdfs-site.xml
    • mapred - mapred-site.xml
    • yarn - yarn-site.xml
    • hive - hive-site.xml
    • pig - pig.properties
    • spark - spark-defaults.conf

For example, to change the spark.master value in the spark-defaults.conf file, you would specify the following property:


spark:spark.master=spark://example.com


You can separate multiple properties with a comma. Each property must be specified in the full prefix:key=value format, where the prefix selects the configuration file. Some properties are reserved and cannot be overridden because they would impact the functionality of the Cloud Dataproc cluster. For more information, see the Cloud Dataproc documentation for the --properties option.
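
To make the prefix mapping concrete, here is a minimal Python sketch of how a comma-separated --properties value could be grouped by target configuration file. It only mirrors the prefix list above. The names PREFIX_TO_FILE and parse_properties are hypothetical, and this is not how the Cloud SDK itself is implemented. It also assumes values contain no commas and does not model reserved properties.

```python
# Illustrative only: mirrors the prefix list from this announcement.
# PREFIX_TO_FILE and parse_properties are hypothetical names, not part
# of the Cloud SDK.
PREFIX_TO_FILE = {
    "core": "core-site.xml",
    "hdfs": "hdfs-site.xml",
    "mapred": "mapred-site.xml",
    "yarn": "yarn-site.xml",
    "hive": "hive-site.xml",
    "pig": "pig.properties",
    "spark": "spark-defaults.conf",
}


def parse_properties(arg):
    """Group a comma-separated 'prefix:key=value' string by target file."""
    by_file = {}
    for item in arg.split(","):
        # Split on the first ':' only, so values such as
        # 'spark://example.com' keep their own colons.
        prefix, colon, rest = item.partition(":")
        key, equals, value = rest.partition("=")
        if not colon or not equals or not key or prefix not in PREFIX_TO_FILE:
            raise ValueError(f"expected prefix:key=value, got {item!r}")
        by_file.setdefault(PREFIX_TO_FILE[prefix], {})[key] = value
    return by_file


if __name__ == "__main__":
    print(parse_properties(
        "spark:spark.master=spark://example.com,hdfs:dfs.replication=2"
    ))
```

Splitting on the first colon and the first equals sign is a deliberate choice here, since property values can themselves contain either character.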


  • Google Developers Console

    • An option has been added to the “Create Clusters” form to enable the cloud-platform scope for a cluster. This lets you view and manage data across all Google Cloud Platform services from Cloud Dataproc clusters. You can find this option by expanding the “Preemptible workers, bucket, network, version, initialization, & access options” section at the bottom of the form.


Bugfixes

  • SparkR jobs no longer immediately fail with a “permission denied” error (Spark JIRA issue)

  • Configuring logging for Spark jobs with the --driver-logging-levels option no longer interferes with Java driver options

  • Google Developers Console

    • The error shown for improperly formatted initialization actions now displays correctly and includes information about the problem

    • Very long error messages now include a scrollbar so the Close button remains on-screen


Connectors and documentation

The Cloud Dataproc release notes also contain these notes and all past release notes.


Best,


Google Cloud Dataproc / Google Cloud Spark & Hadoop Team
