Google Cloud - working example


Lisle Mose

Mar 4, 2019, 3:30:41 PM
to Nextflow
Hi,

I'm running into the error below when trying to use the Google Pipelines API executor:

Any idea what I am doing wrong? (gsutil can ls the work dir and input file)

Is there a concrete working example of Nextflow running against GCP?

Command and error:

nextflow run test.nf -work-dir gs://test_project_data/nextflow-test/work --fq gs://test_project_data/nextflow-test/test_1.fastq
N E X T F L O W  ~  version 19.01.0
Launching `test.nf` [sad_lumiere] - revision: cbe409dcbb
ERROR ~ gs:// URIs must have a host: gs://test_project_data/nextflow-test/work

 -- Check '.nextflow.log' file for details

nextflow.config:
process {
    executor = 'google-pipelines'
    container = 'm0zack/rna_seq_test'
}

cloud {
    instanceType = 'n1-standard-2'
}

google {
    project = 'my-proj-name'
    zone = 'us-east1-b'
}

test.nf:

#!/usr/bin/env nextflow

params.fq = '/datastore/nextgenout4/seqware-analysis/lmose/nextflow/rna_seq_test/data/test/test_1.fastq'

process word_count {
    cpus 1
    memory '2 GB'
    
    output:
    file 'wc.txt' into word_count

    """
      wc -l ${params.fq} > wc.txt
      cat wc.txt
    """
}

Ethan Cerami

Mar 6, 2019, 9:43:16 PM
to Nextflow
Lisle,

I have not yet gotten Nextflow to work on the Google Pipelines API, but I hit the same error and just found a fix.

It seems to work if the bucket name does not include _ or -. For example, instead of test_project_data, try creating a bucket called testproject321.

Not sure why that works, but give it a try?

Ethan

Lisle Mose

Mar 7, 2019, 9:52:33 AM
to next...@googlegroups.com
Thanks for the pointer.  That still did not work for me:

nextflow run test.nf -work-dir gs://lcccnextflowtest2/work --fq gs://lcccnextflowtest2/test1.fastq
N E X T F L O W  ~  version 19.01.0
Launching `test.nf` [modest_agnesi] - revision: cbe409dcbb
ERROR ~ Cannot a find a file system provider for scheme: gs

 -- Check '.nextflow.log' file for details

Paolo Di Tommaso

Mar 7, 2019, 9:55:18 AM
to nextflow
It looks like you didn't specify the Google application credentials properly.

Please include the .nextflow.log file for more info. 

p

Ethan Cerami

Mar 7, 2019, 10:06:01 AM
to Nextflow
Perhaps try setting this first:

export NXF_MODE=google

Then try running again?

Ethan

Lisle Mose

Mar 7, 2019, 10:53:10 AM
to next...@googlegroups.com
Ah, yes - I opened a new terminal and neglected to set GOOGLE_APPLICATION_CREDENTIALS.  After setting this and removing underscores from the bucket name, my simple test has worked and I do see the expected output in the work dir.
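
Since a fresh terminal loses these settings, a small pre-flight check can catch the problem before launching. The following is a minimal sketch, not part of Nextflow: the function name `check_gcp_env` and the key path in the usage comment are hypothetical, and treating a missing NXF_MODE as only a warning is an assumption, since this thread does not confirm whether that variable was required.

```shell
#!/usr/bin/env bash
# Pre-flight check for the environment variables mentioned in this thread.
# Returns 0 if a readable credentials file is configured, 1 otherwise.
check_gcp_env() {
    local status=0
    if [ -z "${GOOGLE_APPLICATION_CREDENTIALS:-}" ]; then
        echo "GOOGLE_APPLICATION_CREDENTIALS is not set" >&2
        status=1
    elif [ ! -r "$GOOGLE_APPLICATION_CREDENTIALS" ]; then
        echo "Key file not readable: $GOOGLE_APPLICATION_CREDENTIALS" >&2
        status=1
    fi
    # Assumption: NXF_MODE is reported as a warning only and does not fail the check.
    if [ "${NXF_MODE:-}" != "google" ]; then
        echo "warning: NXF_MODE is not set to 'google'" >&2
    fi
    return "$status"
}

# Example usage (the key path is hypothetical):
#   export GOOGLE_APPLICATION_CREDENTIALS="$HOME/keys/my-service-account.json"
#   export NXF_MODE=google
#   check_gcp_env && nextflow run test.nf -work-dir gs://lccc-nextflow-test/work
```

Adding the two export lines to a shell profile avoids repeating them in every new terminal.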

It looks like the original issue is with underscores in the bucket name (dashes appear to be OK).  

So, the following does NOT work:
nextflow run test.nf -work-dir gs://lccc_nextflow_test3/work --fq gs://lccc_nextflow_test3/test_1.fastq
N E X T F L O W  ~  version 19.01.0
Launching `test.nf` [golden_sanger] - revision: 3ab717f7aa
ERROR ~ gs:// URIs must have a host: gs://lccc_nextflow_test3/work

The following DOES work:
nextflow run test.nf -work-dir gs://lccc-nextflow-test/work --fq gs://lccc-nextflow-test/test_1.fastq
N E X T F L O W  ~  version 19.01.0
Launching `test.nf` [jolly_ptolemy] - revision: 3ab717f7aa
[warm up] executor > google-pipelines
[43/4ce875] Submitted process > word_count (1)
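
The underscore observation above can be turned into a quick check on a gs:// URI before running. This is only a sketch: the helper names `bucket_of` and `check_bucket_name` are made up for illustration. A plausible explanation, not confirmed in this thread, is that the bucket name is parsed as the host part of a URI, where underscores are not accepted.

```shell
#!/usr/bin/env bash
# Print the bucket part of a gs:// URI, e.g. gs://my-bucket/work -> my-bucket
bucket_of() {
    local uri="$1"
    local rest="${uri#gs://}"      # drop the scheme prefix
    printf '%s\n' "${rest%%/*}"    # keep everything before the first slash
}

# Return 1 (and explain on stderr) if the bucket name contains an underscore.
check_bucket_name() {
    local bucket
    bucket="$(bucket_of "$1")"
    case "$bucket" in
        *_*) echo "bucket '$bucket' contains an underscore" >&2; return 1 ;;
        *)   return 0 ;;
    esac
}

# Example usage with the bucket names from this thread:
#   check_bucket_name gs://lccc_nextflow_test3/work   # fails
#   check_bucket_name gs://lccc-nextflow-test/work    # passes
```

Only the bucket name is checked; underscores in the object path, such as test_1.fastq, did not cause problems in the runs above.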

Thanks,
Lisle

Ethan Cerami

Mar 7, 2019, 11:08:35 AM
to next...@googlegroups.com, Lisle Mose
Great. You have made it further than me, then!

Can I ask if you are using a service account, and which roles you
added to it?

My pipeline runs, but it keeps failing because it does not seem to be
able to write to the google bucket (I think).

Ethan


Lisle Mose

Mar 7, 2019, 11:53:15 AM
to Nextflow
Yes, I'm using a service account with the "Editor" role.

Can you spin up an instance using the service account and write to your bucket outside of Nextflow?
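
One way to run that check from any machine with the Cloud SDK installed is sketched below. The function name `check_bucket_write`, the marker object name, and the `GSUTIL` override variable are hypothetical choices for illustration; `gsutil cp`, `gsutil rm`, and `gcloud auth activate-service-account --key-file` are the standard commands. It assumes the service account key is the one referenced by GOOGLE_APPLICATION_CREDENTIALS.

```shell
#!/usr/bin/env bash
# Try to write and then delete a small marker object in a bucket.
# GSUTIL may be overridden (for example for a dry run); it defaults to gsutil.
check_bucket_write() {
    local bucket_uri="$1"
    local gsutil_cmd="${GSUTIL:-gsutil}"
    local marker
    marker="$(mktemp)" || return 1
    echo "write test" > "$marker"
    # Hypothetical marker object name; a trailing slash on the URI is stripped.
    local target="${bucket_uri%/}/nextflow-write-test.txt"
    if "$gsutil_cmd" cp "$marker" "$target"; then
        "$gsutil_cmd" rm "$target"   # clean up the marker object
        rm -f "$marker"
        return 0
    fi
    rm -f "$marker"
    return 1
}

# Example usage:
#   gcloud auth activate-service-account --key-file="$GOOGLE_APPLICATION_CREDENTIALS"
#   check_bucket_write gs://lccc-nextflow-test && echo "service account can write"
```

If the copy fails here too, the problem is with the service account's permissions on the bucket rather than with Nextflow.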