./snowplow-storage-loader --config config.yml --skip analyze

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   930  100   930    0     0  1028k      0 --:--:-- --:--:-- --:--:--  908k
100   930  100   930    0     0  1117k      0 --:--:-- --:--:-- --:--:--  908k

Loading Snowplow events and shredded types into sp-dev-redshift (Redshift cluster)...
Unexpected error: Java::OrgPostgresqlUtil::PSQLException error executing COPY statements:
BEGIN;
COPY atomic.events FROM 's3://team/snowplow-dev/shredded/good/run=2016-03-19-04-10-50/atomic-events'
  CREDENTIALS 'aws_access_key_id=XXXXXXXXXXXX;aws_secret_access_key=XXXXXXXXXXXXXXXXXXXXXXXXXXX;token=AQoDYXdzEMn//////////wEa4AMwdu3GmnZK4VR7hT2eofDwqh/QdPeYV5KIHkyKswsueheIJbHbEVK6nA55iqGxsj+Ace6Ml8EmcwryVTe4Mh6UBRrD+rRgVXqhzUuP9oyjM0vEdB6fnExI0BOZLuh+KKJiVOPLb5SsNpaLFIpaC7sdmRblAUEOUiD1dLh0YBk5fgU6WImkWCnECvewU0RSySBhZFu6QqlWl0rX+DV08mgnUMmOXqZxCOk2CqF1CzySHRT5aMYx9s1UMj31PSYOR/pY9gQOCHAooZ3osoRz3WkE8hIxE76T0D7y9CL2k/OL+jZyQPGcRvYf53c63WGXEQWq4GKlVw6LPqf3VGf7X3AuBHZGWuBv4U2IQhZ8CAMsO1lc3dYkyaBYxZIa6/v6vUMk/YJWTpgYDAKmWAaDQa96X5Ue6pNDTgSGQM1kH1J4YRSadEC3yDpdV9hYBXs533mySQjAz0364P/EYWEOpVT3B7U49faTeKzdCLRNit7P/tPFdzxzfDRguNAQNK1wrDXDCXx6jRslh9idS8bwxWAkiqRQyCeR8F4Vpza6sTG8NfrNZ6Z3E7BqC4MjYIbEnerrFlmqgHzFumO60OcsVu0+2lUzZGWm58LFElxa2+aUSNooRXn2EvusEPFLqbKw+Ogg8da3twU='
  REGION AS 'us-west-2' DELIMITER '\t' MAXERROR 1 EMPTYASNULL FILLRECORD TRUNCATECOLUMNS TIMEFORMAT 'auto' ACCEPTINVCHARS;
COMMIT;
: ERROR: Out of memory
  Detail:
  -----------------------------------------------
  error:    Out of memory
  code:     1004
  context:  allocation failed, maximum supported size exceeded, requested size: 4294967197
  query:    4881
  location: alloc.cpp:1864
  process:  query0_321 [pid=30699]
  -----------------------------------------------
def get_credentials(config)
  if (
    config[:aws][:access_key_id] == 'iam' &&
    config[:aws][:secret_access_key] == 'iam'
  )
    # this will definitely factor into the direct Redshift query requests,
    # once we can get past the s3 session token issue(s)
    credentials_from_role = Aws::InstanceProfileCredentials.new.credentials
    "aws_access_key_id=#{credentials_from_role.access_key_id};aws_secret_access_key=#{credentials_from_role.secret_access_key};token=#{credentials_from_role.session_token}"
  else
    "aws_access_key_id=#{config[:aws][:access_key_id]};aws_secret_access_key=#{config[:aws][:secret_access_key]}"
  end
end

aws:
  access_key_id: XXXXXXXXXXXX
  secret_access_key: XXXXXXXXXXXX
  s3:
    region: us-west-2
    buckets:
      assets: s3://snowplow-hosted-assets
      jsonpath_assets: s3://team/snowplow-dev/jsonpaths
      log: s3n://team/snowplow-dev/etl/logs
      raw:
        in: ["s3n://team/snowplow-dev/raw"]
        processing: s3://team/snowplow-dev/etl/processing
        archive: s3://team/snowplow-dev/archive/raw
      enriched:
        good: s3://team/snowplow-dev/enriched/good
        bad: s3://team/snowplow-dev/enriched/bad
        errors:
        archive: s3://team/snowplow-dev/enriched/archive
      shredded:
        good: s3://team/snowplow-dev/shredded/good
        bad: s3://team/snowplow-dev/shredded/bad
        errors:
        archive: s3://team/snowplow-dev/shredded/archive
  emr:
    ami_version: 4.3.0
    region: us-west-2
    jobflow_role: instance-profile
    service_role: role
    placement:
    ec2_subnet_id: subnet-a6374fc3
    ec2_key_name: sp-dev-batchprocessor
    bootstrap: ["s3://team/proxy.sh"]
    software:
      hbase:
      lingual:
    jobflow:
      master_instance_type: m4.large
      core_instance_count: 2
      core_instance_type: m4.2xlarge
      task_instance_count: 2
      task_instance_type: m4.2xlarge
      task_instance_bid:
    bootstrap_failure_tries: 3
    additional_info:
collectors:
  format: thrift
enrich:
  job_name: sp-dev-enrich
  versions:
    hadoop_enrich: 1.6.0
    hadoop_shred: 0.8.0
    hadoop_elasticsearch: 0.1.0
  continue_on_unexpected_error: false
  output_compression: NONE
storage:
  download:
    folder:
  targets:
    - name: "sp-dev-redshift"
      type: redshift
      database: snowplow
      port: 5439
      ssl_mode: disable
      table: atomic.events
      username: "snowplowdata"
      password: "XXXXXXXXXX"
      es_nodes_wan_only:
      maxerror: 1
      comprows: 200000
monitoring:
  tags: {} # Name-value pairs describing this job
  logging:
    level: DEBUG # You can optionally switch to INFO for production
iglu:
  schema: iglu:com.snowplowanalytics.self-desc/schema/jsonschema/1-0-0
  data:
    cache_size: 1000
    repositories:
      - name: "Iglu Central"
        priority: 0
        vendor_prefixes:
          - com.snowplowanalytics
        connection:
          http:
      - name: "Our Iglu repository"
        priority: 5
        vendor_prefixes:
          - com.company
        connection:
          http:

--
You received this message because you are subscribed to the Google Groups "Snowplow" group.
To unsubscribe from this group and stop receiving emails from it, send an email to snowplow-use...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
We solved it. It was actually unrelated to cluster size, WLM, etc. It was a combination of a missed step in Redshift setup and some horrible error messaging on Redshift's (or maybe Postgres's, not sure) part: our storageloader user had lost usage rights on either the atomic schema or the atomic.events table. We recreated the grants and it works fine now. /shrug
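For anyone who lands here with the same "Out of memory" symptom: the fix amounted to re-granting the loader user's privileges. A sketch of the grants, run as a superuser or schema owner, assuming the snowplowdata user and atomic.events table from the config earlier in this thread (the exact privilege list your setup needs may differ):

```sql
-- User and table names come from this thread's config;
-- the specific privilege set is an assumption, adjust as needed.
GRANT USAGE ON SCHEMA atomic TO snowplowdata;
GRANT SELECT, INSERT ON atomic.events TO snowplowdata;
```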
S
--
No, just this single install from a single EC2 instance. The really weird thing is that we're only trying to load a few hundred to a few thousand records for testing purposes. We have a separate install in a separate AWS account running r72, but loading the exact same data via a second tracker/endpoint. That one's been running fine for months without incident. Did the Storage Loader change drastically between r72 and r77? Did the Redshift COPY statement change significantly?

S
--