Guys,
I've read most of the discussions about emr-etl-runner but I'm still not able to prepare a proper config.yml.
So I would really appreciate your help in solving this issue.
This is the message that I received after trying to run the application:
Value guarded in: Snowplow::EmrEtlRunner::Cli::load_config
With Contract: Maybe, String => Hash
At: /root/snowplow-emr-etl-runner!/emr-etl-runner/lib/snowplow-emr-etl-runner/cli.rb:134 ):
/root/snowplow-emr-etl-runner!/gems/contracts-0.7/lib/contracts.rb:69:in `Contract'
org/jruby/RubyProc.java:271:in `call'
/root/snowplow-emr-etl-runner!/gems/contracts-0.7/lib/contracts.rb:147:in `failure_callback'
/root/snowplow-emr-etl-runner!/gems/contracts-0.7/lib/contracts/decorators.rb:164:in `common_method_added'
/root/snowplow-emr-etl-runner!/gems/contracts-0.7/lib/contracts/decorators.rb:159:in `common_method_added'
file:/root/snowplow-emr-etl-runner!/emr-etl-runner/bin/snowplow-emr-etl-runner:37:in `(root)'
org/jruby/RubyKernel.java:1091:in `load'
file:/root/snowplow-emr-etl-runner!/META-INF/main.rb:1:in `(root)'
org/jruby/RubyKernel.java:1072:in `require'
file:/root/snowplow-emr-etl-runner!/META-INF/main.rb:1:in `(root)'
/tmp/jruby4467930534781662455extract/jruby-stdlib-1.7.20.1.jar!/META-INF/jruby.home/lib/ruby/shared/rubygems/core_ext/kernel_require.rb:1:in `(root)'
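Since the contract in the trace ends in `=> Hash`, one thing that may be worth ruling out is whether config.yml parses to a Hash at the top level at all. Below is a minimal sketch of such a check in plain Ruby. The helper name `top_level_summary` is hypothetical and not part of EmrEtlRunner, and the check only assumes the file is ordinary YAML; it does not reproduce whatever validation the runner itself performs.

```ruby
require 'yaml'

# Hypothetical helper (not part of EmrEtlRunner): parse YAML text and
# report whether the top level is a mapping (Hash), listing its keys.
def top_level_summary(yaml_text)
  parsed = YAML.load(yaml_text)
  if parsed.is_a?(Hash)
    "Hash with top-level keys: #{parsed.keys.join(', ')}"
  else
    "not a Hash (got #{parsed.class})"
  end
end

# Usage: ruby check_config.rb config.yml
# Does nothing when no path is given on the command line.
puts top_level_summary(File.read(ARGV[0])) if ARGV[0]
```

Comparing the reported top-level keys against the sample config shipped with the same release could show whether a section is missing or nested at the wrong level.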
Below is my config.yml:
aws:
access_key_id:
secret_access_key:
s3:
region: eu-west-1
buckets:
assets: s3://vd-snowplow-etl-assets/
log: s3://vd-snowplow-etl/logs/
raw:
in:
- s3://vd-snowplow-etl-logfiles/
processing: s3://vd-snowplow-etl/processing/
archive: s3://vd-snowplow-etl-archive/
enriched:
good: s3://vd-snowplow-etl/enriched/good/
bad: s3://vd-snowplow-etl/enriched/bad/
errors: s3://vd-snowplow-etl/enriched/errors/
shredded:
good: s3://vd-snowplow-etl/shredded/good/
bad: s3://vd-snowplow-etl/shredded/bad/
errors: s3://vd-snowplow-etl/shredded/errors/
emr:
ami_version: 4.3.0 # Choose as per http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-plan-ami.html
region: eu-west-1 # Always set this
placement: # Set this if not running in VPC. Leave blank otherwise
ec2_subnet_id: subnet-7083921b # Set this if running in VPC. Leave blank otherwise
ec2_key_name: vd-com-aws-test-key
# Adjust your Hadoop cluster below
bootstrap: [] # Set this to specify custom boostrap actions. Leave empty otherwise
software:
hbase: "0.92.0" # Optional. To launch on cluster, provide version, "0.92.0", keep quotes. Leave empty otherwise.
lingual: "1.1"
jobflow:
master_instance_type: m1.small
core_instance_count: 2
core_instance_type: m1.small
task_instance_count: 0 # Increase to use spot instances
task_instance_type: m1.small
task_instance_bid: 0.015
etl:
job_name: Snowplow ETL # Give your job a name
hadoop_etl_version: 0.5.0 # Version of the Hadoop Enrichment process
collector_format: clj-tomcat # Or 'clj-tomcat' for the Clojure Collector
continue_on_unexpected_error: false # You