Dear GCP Cloud Life Sciences team,
I am working with Variant Transforms, and its preprocessor (vcf_to_bq_preprocess) returns an error. I have checked my setup many times, and I cannot tell whether this is a bug or a mistake in my configuration.
My configuration is shared below.
export PROJECT=genomic-analysis-255208
export BUCKET=${PROJECT}-genomic-analysis-255208-vtgenomics
gcloud config set project $PROJECT
\#!/bin/bash
\# Parameters to replace:
\# The PROJECT_ID is the name of the GCP project that contains your BigQuery dataset.
PROJECT=genomic-analysis-255208
INPUT_PATTERN=gs://genomic-analysis-255208-vtgenomics/sars-cov-2-gisaid.vcf
REPORT_PATH=gs://genomic-analysis-255208-vtgenomics/report.tsv
RESOLVED_HEADERS_PATH=gs://genomic-analysis-255208-vtgenomics/resolved_headers.vcf
TEMP_LOCATION=gs://genomic-analysis-255208-vtgenomics/temp
COMMAND="vcf_to_bq_preprocess \
  --input_pattern ${INPUT_PATTERN} \
  --report_path ${REPORT_PATH} \
  --resolved_headers_path ${RESOLVED_HEADERS_PATH} \
  --report_all_conflicts true \
  --temp_location ${TEMP_LOCATION} \
  --job_name vcf-to-bigquery-preprocess \
  --runner DataflowRunner"
docker run -v ~/.config:/root/.config gcr.io/cloud-lifesciences/gcp-variant-transforms --project "${PROJECT}" --zones us-west1-b "${COMMAND}"
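Secondly, the --zones flag appears to be deprecated. From what I checked, Dataflow now uses worker-region and worker-zone instead. With --zones the run fails with a getopt error, and if I replace the --zones flag with --region us-central1, it returns a different error. The output of both attempts is below: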
getopt: unrecognized option '--zones' --project 'genomic-analysis-255208' -- 'us-west1-b' 'vcf_to_bq_preprocess --input_pattern gs://genomic-analysis-255208-vtgenomics/*.vcf --report_path gs://genomic-analysis-255208-vtgenomics/report.tsv --resolved_headers_path gs://genomic-analysis-255208-vtgenomics/resolved_headers.vcf --report_all_conflicts true --temp_location gs://genomic-analysis-255208-vtgenomics/temp --job_name vcf-to-bigquery-preprocess --runner DataflowRunner'
--project 'genomic-analysis-255208' --region 'us-central1' -- 'vcf_to_bq_preprocess --input_pattern gs://genomic-analysis-255208-vtgenomics/*.vcf --report_path gs://genomic-analysis-255208-vtgenomics/report.tsv --resolved_headers_path gs://genomic-analysis-255208-vtgenomics/resolved_headers.vcf --report_all_conflicts true --temp_location gs://genomic-analysis-255208-vtgenomics/temp --job_name vcf-to-bigquery-preprocess --runner DataflowRunner'
Please set the temp_location using flag --temp_location.
To summarize, this is the error I get with --region, and I would be grateful for your help in resolving it:
--project 'genomic-analysis-255208' --region 'us-central1' -- 'vcf_to_bq_preprocess --input_pattern gs://genomic-analysis-255208-vtgenomics/*.vcf --report_path gs://genomic-analysis-255208-vtgenomics/report.tsv --resolved_headers_path gs://genomic-analysis-255208-vtgenomics/resolved_headers.vcf --report_all_conflicts true --temp_location gs://genomic-analysis-255208-vtgenomics/temp --job_name vcf-to-bigquery-preprocess --runner DataflowRunner'
Please set the temp_location using flag --temp_location.
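The last message suggests that the wrapper script inside the image may expect --temp_location as one of its own options, next to --project and --region, and not only inside the quoted COMMAND string. The sketch below shows that variant. It is untested: the position of --temp_location is an assumption drawn only from the error text, COMMAND is left as in the configuration above (whether --temp_location should then be dropped from it is unclear), and the ARGS array and DRY_RUN switch are hypothetical helpers added only so the command can be inspected before running it.

```shell
#!/bin/bash
# Sketch only: assumes the image's wrapper accepts --temp_location as a
# top-level option (inferred from the error message, not verified).
PROJECT=genomic-analysis-255208
INPUT_PATTERN=gs://genomic-analysis-255208-vtgenomics/sars-cov-2-gisaid.vcf
REPORT_PATH=gs://genomic-analysis-255208-vtgenomics/report.tsv
RESOLVED_HEADERS_PATH=gs://genomic-analysis-255208-vtgenomics/resolved_headers.vcf
TEMP_LOCATION=gs://genomic-analysis-255208-vtgenomics/temp

# Same COMMAND as in the configuration above, unchanged.
COMMAND="vcf_to_bq_preprocess \
  --input_pattern ${INPUT_PATTERN} \
  --report_path ${REPORT_PATH} \
  --resolved_headers_path ${RESOLVED_HEADERS_PATH} \
  --report_all_conflicts true \
  --temp_location ${TEMP_LOCATION} \
  --job_name vcf-to-bigquery-preprocess \
  --runner DataflowRunner"

# Build the docker arguments as an array so every option stays one word.
ARGS=(
  run -v "${HOME}/.config:/root/.config"
  gcr.io/cloud-lifesciences/gcp-variant-transforms
  --project "${PROJECT}"
  --region us-central1
  --temp_location "${TEMP_LOCATION}"
  "${COMMAND}"
)

# DRY_RUN defaults to 1 and only prints the command; use DRY_RUN=0 to run it.
if [ "${DRY_RUN:-1}" = "1" ]; then
  printf '%q ' docker "${ARGS[@]}"
  echo
else
  docker "${ARGS[@]}"
fi
```

Running the script as-is prints the full docker command with shell quoting, so the placement of each flag can be checked; running it with DRY_RUN=0 submits the job.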
Thanks in advance.
Best regards,
Haroon Zeb Khan
Research Associate
Department of Computer Science,
University of Engineering and Technology Taxila, Pakistan
Cell #: +92-337-7146818,03028617679