understand the config.yml file

Brijesh Singh

Mar 31, 2016, 11:16:34 PM
to Snowplow
Hello there,

I successfully deployed Snowplow a month ago and it is working fine for me. I use the JavaScript tracker, Android tracker and Node.js tracker with EmrEtlRunner, loading into Redshift.
1. Now I need to understand some of the settings in the config.yml file:
enrich:
  job_name: bikroy-snowplow-emr-etl-runner # Give your job a name
  versions:
    hadoop_enrich: 1.5.0 # Version of the Hadoop Enrichment process
    hadoop_shred: 0.6.0 # Version of the Hadoop Shredding process
    hadoop_elasticsearch: 0.1.0 # Version of the Hadoop to Elasticsearch copying process
  continue_on_unexpected_error: true # Set to 'true' (and set :out_errors: above) if you don't want any exceptions thrown from ETL


In the enrich section I have set continue_on_unexpected_error: true. What will happen if I set it to false instead?
Could you please clarify what this setting does?

2. I would also like to understand the configuration in the /config/enrichments/javascript_script_enrichment.json file:

"data": {

        "vendor": "com.snowplowanalytics.snowplow",
        "name": "javascript_script_config",
        "enabled": false,
        "parameters": {
            "script": "ZnVuY3Rpb24gcHJvY2VzcyhldmVudCkgew0KDQogIHZhciBwbGF0Zm9ybSA9IGV2ZW50LmdldFBsYXRmb3JtKCksDQogICAgICBhcHBJZCAgICA9IGV2ZW50LmdldEFwcF9pZCgpOw0KDQogIGlmIChwbGF0Zm9ybSA9PSAic2VydmVyIiAmJiBhcHBJZCAhPSAic2VjcmV0Iikgew0KICAgIHRocm93ICJTZXJ2ZXItc2lkZSBldmVudCBoYXMgaW52YWxpZCBhcHBfaWQ6ICIgKyBhcHBJZDsNCiAgfQ0KICANCiAgaWYgKGFwcElkID09IG51bGwpIHsNCiAgICByZXR1cm4gW107DQogIH0NCg0KICAvLyBVc2UgbmV3IFN0cmluZygpIGJlY2F1c2UgaHR0cDovL25lbHNvbndlbGxzLm5ldC8yMDEyLzAyL2pzb24tc3RyaW5naWZ5LXdpdGgtbWFwcGVkLXZhcmlhYmxlcy8NCiAgdmFyIGFwcElkVXBwZXIgPSBuZXcgU3RyaW5nKGFwcElkLnRvVXBwZXJDYXNlKCkpOw0KDQogIHJldHVybiBbIHsgc2NoZW1hOiAiaWdsdTpjb20uYWNtZS9mb28vanNvbnNjaGVtYS8xLTAtMCIsDQogICAgICAgICAgICAgICBkYXRhOiB7IGFwcElkVXBwZXI6IGFwcElkVXBwZXIgfQ0KICAgICAgICAgICB9IF07DQp9"
        }
    }


I am currently using "enabled": false. What happens if I change it to "enabled": true?
Could you please give me a detailed explanation of these two points?

Thanks
Brijesh Singh


Ihor Tomilenko

Mar 31, 2016, 11:48:58 PM
to Snowplow
Hi Brijesh,

1. If continue_on_unexpected_error is set to false, the EMR job terminates with an error as soon as an unexpected error or exception is raised during the enrichment process. If it is set to true, an error message is written to the error bucket instead and the job keeps running where possible.

2. What happens depends on the script you supply in the script parameter of that JSON. The file you are referring to is just an example. You have to provide your own Base64-encoded JavaScript function. Please read the following if you want to know more:

Regards,
Ihor
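
For anyone who wants to see what is inside the script parameter, here is a minimal Python sketch (not part of Snowplow) that decodes a Base64 value back to JavaScript and encodes your own function for the JSON file. The helper names encode_script and decode_script, and the placeholder function in my_script, are assumptions made up for illustration. The only data taken from this thread is the first 32 characters of the quoted script value.

```python
import base64


def encode_script(js_source: str) -> str:
    """Return the Base64 text to paste into the "script" parameter."""
    return base64.b64encode(js_source.encode("utf-8")).decode("ascii")


def decode_script(b64_text: str) -> str:
    """Return the JavaScript source stored in a "script" parameter."""
    return base64.b64decode(b64_text).decode("utf-8")


# First 32 characters of the "script" value quoted earlier in this thread.
thread_prefix = "ZnVuY3Rpb24gcHJvY2VzcyhldmVudCkg"
print(repr(decode_script(thread_prefix)))

# Hypothetical placeholder function, used here only to show the round trip.
my_script = "function process(event) { return []; }"
encoded = encode_script(my_script)
print(encoded)
```

Decoding the full value from the JSON file the same way shows the complete example function, which is a quick way to check what a script does before switching "enabled" to true.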

Brijesh Singh

Apr 1, 2016, 2:58:16 AM
to Snowplow
Thanks Ihor,

Is there any way to set an error threshold, so that the process stops once the number of errors reaches that threshold?

Ihor Tomilenko

Apr 1, 2016, 2:26:07 PM
to Snowplow
Hi Brijesh,

Unfortunately not. It's not something we can control. This behavior is built into Cascading, the Hadoop processing framework we use.

Regards,
Ihor