Variant transforms returns bug while running preprocessor algorithm


Haroon Zeb

Jan 31, 2021, 10:14:27 PM
to gcp-life-sci...@googlegroups.com


Dear GCP Cloud Life Sciences team,

I am working with Variant Transforms, and its preprocessor returns an error. I have checked many times, and I think it is either a bug or some other error.

In addition, I would like to share my configuration as well:

export PROJECT=genomic-analysis-255208
export BUCKET=${PROJECT}-genomic-analysis-255208-vtgenomics  
gcloud config set project $PROJECT
#!/bin/bash
# Parameters to replace:
# The PROJECT_ID is the name of the GCP project that contains your BigQuery dataset.


PROJECT=genomic-analysis-255208
INPUT_PATTERN=gs://genomic-analysis-255208-vtgenomics/sars-cov-2-gisaid.vcf 
REPORT_PATH=gs://genomic-analysis-255208-vtgenomics/report.tsv
RESOLVED_HEADERS_PATH=gs://genomic-analysis-255208-vtgenomics/resolved_headers.vcf 
TEMP_LOCATION=gs://genomic-analysis-255208-vtgenomics/temp 

COMMAND="vcf_to_bq_preprocess \
  --input_pattern ${INPUT_PATTERN} \
  --report_path ${REPORT_PATH} \
  --resolved_headers_path ${RESOLVED_HEADERS_PATH} \
  --report_all_conflicts true \
  --temp_location ${TEMP_LOCATION} \
  --job_name vcf-to-bigquery-preprocess \
  --runner DataflowRunner"

docker run -v ~/.config:/root/.config gcr.io/cloud-lifesciences/gcp-variant-transforms --project "${PROJECT}" --zones us-west1-b "${COMMAND}"

Secondly, the --zones flag is deprecated; I checked, and for Dataflow it is now worker-region and worker-zone. If I replace the --zones flag with the region flag (--region us-central1), it still returns an error. The output with --zones is followed by the output with --region:
getopt: unrecognized option '--zones' --project 'genomic-analysis-255208' -- 'us-west1-b' 'vcf_to_bq_preprocess --input_pattern gs://genomic-analysis-255208-vtgenomics/*.vcf --report_path gs://genomic-analysis-255208-vtgenomics/report.tsv --resolved_headers_path gs://genomic-analysis-255208-vtgenomics/resolved_headers.vcf --report_all_conflicts true --temp_location gs://genomic-analysis-255208-vtgenomics/temp --job_name vcf-to-bigquery-preprocess --runner DataflowRunner'

--project 'genomic-analysis-255208' --region 'us-central1' -- 'vcf_to_bq_preprocess --input_pattern gs://genomic-analysis-255208-vtgenomics/*.vcf --report_path gs://genomic-analysis-255208-vtgenomics/report.tsv --resolved_headers_path gs://genomic-analysis-255208-vtgenomics/resolved_headers.vcf --report_all_conflicts true --temp_location gs://genomic-analysis-255208-vtgenomics/temp --job_name vcf-to-bigquery-preprocess --runner DataflowRunner'
Please set the temp_location using flag --temp_location.

The error I get is shown again below. Could you please help me resolve it?

 --project 'genomic-analysis-255208' --region 'us-central1' -- 'vcf_to_bq_preprocess   --input_pattern gs://genomic-analysis-255208-vtgenomics/*.vcf   --report_path gs://genomic-analysis-255208-vtgenomics/report.tsv   --resolved_headers_path gs://genomic-analysis-255208-vtgenomics/resolved_headers.vcf   --report_all_conflicts true   --temp_location gs://genomic-analysis-255208-vtgenomics/temp   --job_name vcf-to-bigquery-preprocess   --runner DataflowRunner'
Please set the temp_location using flag --temp_location.
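
Both failures shown above can be caught before a container is started. The following is a minimal sketch, not part of Variant Transforms: `check_wrapper_args` is a hypothetical helper, and its rules are only what this thread shows (getopt rejected `--zones`, and the wrapper asked for `--temp_location` even though it was present inside the quoted pipeline command). It assumes the last argument is the quoted pipeline command.

```shell
#!/bin/bash
# Hypothetical pre-flight check, not part of Variant Transforms.
# Usage: check_wrapper_args <docker-run wrapper flags...> "<pipeline command>"
# Assumption: the last argument is the quoted pipeline command string.
check_wrapper_args() {
  local count=$#
  if [ "$count" -lt 1 ]; then
    echo "usage: check_wrapper_args <wrapper flags...> \"<pipeline command>\"" >&2
    return 2
  fi
  local pipeline_command="${!count}"   # indirect expansion: the last argument
  local has_temp_location=0
  local problems=0
  local flag

  # Inspect every argument except the last one (the pipeline command).
  for flag in "${@:1:$((count - 1))}"; do
    case "$flag" in
      --zones|--zones=*)
        echo "--zones was rejected by getopt in this thread; use --region" >&2
        problems=1
        ;;
      --temp_location|--temp_location=*)
        has_temp_location=1
        ;;
    esac
  done

  if [ "$has_temp_location" -eq 0 ]; then
    echo "--temp_location is missing from the wrapper flags" >&2
    problems=1
  fi

  # The wrapper did not pick the flag up from inside the pipeline command.
  case "$pipeline_command" in
    *--temp_location*)
      echo "--temp_location is inside the pipeline command; pass it to docker run instead" >&2
      problems=1
      ;;
  esac

  return "$problems"
}

# Example: wrapper flags separated from the pipeline command pass the check.
if check_wrapper_args --project genomic-analysis-255208 --region us-west1 \
    --temp_location gs://genomic-analysis-255208-vtgenomics/temp \
    "vcf_to_bq_preprocess --runner DataflowRunner"; then
  echo "pre-flight check passed"
fi
```

The function only prints warnings to stderr and returns a non-zero status when it finds a problem, so it can guard the `docker run` line with a plain `&&`.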

Thanks in advance.





Best regards,

Haroon Zeb Khan

Research Associate 

Department of Computer Science, 

University of Engineering and Technology Taxila, Pakistan

Cell #: +92-337-7146818,03028617679

Saman Vaisipour

Feb 1, 2021, 9:49:33 AM
to Haroon Zeb, GCP Life Sciences Discuss
Hi Haroon,
You need to update your command as follows:

COMMAND="vcf_to_bq_preprocess \
  --input_pattern ${INPUT_PATTERN} \
  --report_path ${REPORT_PATH} \
  --resolved_headers_path ${RESOLVED_HEADERS_PATH} \
  --report_all_conflicts true \
  --job_name vcf-to-bigquery-preprocess \
  --runner DataflowRunner"

docker run -v ~/.config:/root/.config   gcr.io/cloud-lifesciences/gcp-variant-transforms --project "${PROJECT}" --region us-west1 --temp_location ${TEMP_LOCATION} "${COMMAND}"
 
Note there are two changes:
  • Instead of --zones in your docker run command you need to set a GCP region using --region (here I used us-west1).
  • The temp_location flag must be set in the docker run command instead of your COMMAND.
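
Putting the two changes together, the whole script might look like the sketch below. The values are the ones posted in this thread; `REGION`, `DOCKER_ARGS`, and the `DRY_RUN` switch are illustrative additions, not part of Variant Transforms. With `DRY_RUN` left at its default of 1 the script only prints the docker command, so it can be inspected before anything is launched.

```shell
#!/bin/bash
# Sketch only: values come from this thread; REGION, DOCKER_ARGS and DRY_RUN
# are illustrative names, not part of Variant Transforms.
set -eo pipefail

PROJECT=genomic-analysis-255208
REGION=us-west1
INPUT_PATTERN=gs://genomic-analysis-255208-vtgenomics/sars-cov-2-gisaid.vcf
REPORT_PATH=gs://genomic-analysis-255208-vtgenomics/report.tsv
RESOLVED_HEADERS_PATH=gs://genomic-analysis-255208-vtgenomics/resolved_headers.vcf
TEMP_LOCATION=gs://genomic-analysis-255208-vtgenomics/temp

# Pipeline flags only: --temp_location is deliberately NOT in here.
COMMAND="vcf_to_bq_preprocess \
  --input_pattern ${INPUT_PATTERN} \
  --report_path ${REPORT_PATH} \
  --resolved_headers_path ${RESOLVED_HEADERS_PATH} \
  --report_all_conflicts true \
  --job_name vcf-to-bigquery-preprocess \
  --runner DataflowRunner"

# Wrapper flags (--project, --region, --temp_location) go to docker run,
# before the quoted pipeline command.
DOCKER_ARGS=(
  run -v "${HOME}/.config:/root/.config"
  gcr.io/cloud-lifesciences/gcp-variant-transforms
  --project "${PROJECT}"
  --region "${REGION}"
  --temp_location "${TEMP_LOCATION}"
  "${COMMAND}"
)

if [ "${DRY_RUN:-1}" = "1" ]; then
  # Print the shell-quoted command instead of running it.
  printf '%q ' docker "${DOCKER_ARGS[@]}"
  echo
else
  docker "${DOCKER_ARGS[@]}"
fi
```

Run it with `DRY_RUN=0` to actually start the container.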

Please follow these instructions for more information on how to run Variant Transforms's preprocessor. 


--

Saman Vaisipour | Software Eng | sam...@google.com | 519-513-5756
 