Announcing bdutil-0.35.1, gcs-connector-1.2.8, and misc updates to the other Hadoop connectors


Hadoop on Google Cloud Platform Team

Aug 8, 2014, 5:24:16 PM
to gcp-had...@google.com, gcp-hadoo...@googlegroups.com

Greetings, users of Hadoop on Google Cloud Platform!


We're pleased to announce new versions of bdutil, the GCS connector for Hadoop, the BigQuery connector for Hadoop, and the Datastore connector for Hadoop, with bug fixes and minor improvements. Download bdutil-0.35.1.tar.gz or bdutil-0.35.1.zip now to try it out, or visit the developer documentation, where the download links now point to the latest version.

bdutil


There were several small usability changes to bdutil. The most significant is a change in the VM naming conventions. The default VM prefix is now "hadoop", and the master and worker suffixes are now 'm' and 'w' respectively. To use bdutil-0.35.1 with any clusters spun up with a previous version, you will need to add the --old_hostname_suffixes flag; without it, bdutil will not operate on existing clusters deployed with older bdutil versions.
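
As a rough sketch of what the naming change means in practice, the snippet below prints the VM names each convention would produce. The helper functions master_name and worker_name are hypothetical and not part of bdutil, and the worker index 0 is just an example value; only the suffix patterns themselves come from the CHANGES.txt entry later in this message.

```shell
# Illustrative helpers only; bdutil does not ship these functions.

# Print the master VM name for prefix $1.
# Pass "old" as $2 to get the pre-0.35.1 convention.
master_name() {
  if [ "${2:-new}" = "old" ]; then
    echo "$1-nn"
  else
    echo "$1-m"
  fi
}

# Print the name of worker number $2 for prefix $1.
# Pass "old" as $3 to get the pre-0.35.1 convention.
worker_name() {
  if [ "${3:-new}" = "old" ]; then
    echo "$1-dn-$2"
  else
    echo "$1-w-$2"
  fi
}

master_name hadoop          # new default prefix and suffix
worker_name hadoop 0
master_name hs-ghfs old     # old default prefix and suffix
worker_name hs-ghfs 0 old
```

If the names of your existing VMs match the "old" output, that cluster needs --old_hostname_suffixes when managed with bdutil-0.35.1.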

GCS Connector Binaries


You can download the connector jarfile directly: gcs-connector-1.2.8-hadoop1.jar for use with Hadoop 1 (and other versions of the same series, like 0.20.205.0) or gcs-connector-1.2.8-hadoop2.jar for use with Hadoop 2. Alternatively, allow bdutil to perform the installation and configuration for you without having to deal directly with jarfiles.


As always, please send any questions or comments to gcp-hadoo...@google.com


All the best,

Your Google Team



gcs-connector-1.2.8: CHANGES.txt

1.2.8 - 2014-08-07


 1. Changed the manner in which the GCS connector jar is built to A) reduce
    included dependencies to only those parts which are used and B) repackage
    dependencies whose versions conflict with those bundled with Hadoop.
 2. Deprecated fs.gs.system.bucket config.


bdutil-0.35.1: CHANGES.txt

0.35.1 - 2014-08-07


 1. Added a boolean bdutil option DEBUG_MODE with corresponding flags
    -D/--debug which turns on high-verbosity modes for gcutil and gsutil
    calls during the deployment, including on the remote VMs.
 2. Added the ability for the Google connectors, bdconfig, and Hadoop
    distributions to be stored and fetched from gs:// style URLs in addition
    to http:// URLs.
 3. In VERBOSE_MODE, on failure the detailed debuginfo.txt is now also printed
    to the console in addition to being available in the /tmp directory.
 4. Moved all configuration templates into conf/.
 5. Changed the default PREFIX to 'hadoop' instead of 'hs-ghfs', and the
    naming convention for masters/workers to follow $PREFIX-m and $PREFIX-w-$i
    instead of $PREFIX-nn and $PREFIX-dn-$i.  IMPORTANT: This breaks
    compatibility with existing clusters deployed with bdutil 0.34.x and older,
    but there is a new flag "--old_hostname_suffixes" to continue using the old
    -nn/-dn naming convention. For example, to turn down an old cluster if
    you've been using the default prefix:
    ./bdutil --prefix=hs-ghfs --old_hostname_suffixes delete
 6. Fixed a bug in VM environments where run_command could not find commands
    such as 'hadoop' in their PATH.
 7. Updated BigQuery / Datastore sample scripts to be used with
    "./bdutil run_command." rather than locally.
 8. Added a test to guarantee VMs had no more than 64 characters in their fully
    qualified domain names.
 9. Added the import_env helper to allow "_env.sh" files to depend on each
    other.
 10. Renamed spark1_env.sh to spark_env.sh.
 11. Added a gsutil update check upon first entering a VM.
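
To illustrate the 64-character limit on fully qualified domain names from item 8, here is a minimal sketch of such a check. The function fqdn_length_ok is a hypothetical helper and not bdutil's actual test, and the example FQDN (including the "example-project" name and its domain layout) is a placeholder chosen for illustration.

```shell
# Hypothetical helper, not bdutil's own test: succeeds when the
# fully qualified domain name passed as $1 has at most 64 characters.
fqdn_length_ok() {
  [ "${#1}" -le 64 ]
}

# Placeholder FQDN; substitute the name your VM actually reports.
fqdn="hadoop-w-0.c.example-project.internal"

if fqdn_length_ok "$fqdn"; then
  echo "ok: $fqdn has ${#fqdn} characters"
else
  echo "too long: $fqdn has ${#fqdn} characters" >&2
fi
```

Because the prefix is part of every hostname, a long custom prefix is the most likely way to run into this limit.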


datastore-connector-0.14.6: CHANGES.txt

0.14.6 - 2014-08-07


 1. Misc updates in library dependencies.


bigquery-connector-0.4.3: CHANGES.txt

0.4.3 - 2014-08-07


 1. Added better validation to BigQueryUtils.getSchemaFromString used by
    BigQueryOutputFormat to throw descriptive IllegalArgumentExceptions
    instead of NullPointerExceptions for most types of malformed schemas.
 2. Fixed a bug in BigQueryUtils.getSchemaFromString to support 'repeated'
    fields inside of nested records; used to throw IllegalStateException
    if a nested record contained more than 1 inner field.
