Rscript to CWL


Dzu T

Sep 19, 2016, 6:21:56 PM
to common-workflow-language
Hello,

   I am attempting to run an R script in CWL.  The R script takes a sample sheet as input, creates barcode.txt and library_params.txt files, and puts them into 8 directories, one for each lane of a flow cell.  I can run the command fine at the shell prompt (yellow highlight in the log below), but the same command in cwl-runner (blue highlight) could not load the 'optparse' library specified in the R script (red highlight).

   I am also not sure how to specify, in CWL, the output folders and files generated by the R script.

   My r-MakeBarLibParTables-job_v2.yml  looks like this:

 dir: Run4_forR/

   And the r-MakeBarLibParTables.cwl looks like this:

cwlVersion: v1.0

class: CommandLineTool
baseCommand: Rscript
 
inputs:

  - id: dir
    type: string
    inputBinding:
      position: 2
      prefix: "--dir"
outputs:
  output:
    type: File
 
arguments:
  - valueFrom: "/home/ubuntu/MakeBarcodeTablesV3.R"
    position: 1

   Below is the debug log:

~$ cwl-runner --debug r-MakeBarLibParTables.cwl r-MakeBarLibParTables-job_v2.yml
/usr/local/bin/cwl-runner 1.0.20160811184335
[job r-MakeBarLibParTables.cwl] initializing from file:///home/ubuntu/r-MakeBarLibParTables.cwl
[job r-MakeBarLibParTables.cwl] {
    "dir": "Run4_forR/"
}
[job r-MakeBarLibParTables.cwl] path mappings is {}
[job r-MakeBarLibParTables.cwl] command line bindings is [
    {
        "position": [
            -1000000,
            0
        ],
        "datum": "Rscript"
    },
    {
        "position": [
            1,
            0
        ],
        "valueFrom": "/home/ubuntu/MakeBarcodeTablesV3.R"
    },
    {
        "position": [
            2,
            "dir"
        ],
        "prefix": "--dir",
        "datum": "Run4_forR/"
    }
]
[job r-MakeBarLibParTables.cwl] /data/tmp/tmp9YTLLT$ Rscript \
    /home/ubuntu/MakeBarcodeTablesV3.R \
    --dir \
    Run4_forR/

Error in library("optparse") : there is no package called 'optparse'
Execution halted
Error while running job: Error validating output record, could not validate field `output` because
  `[]`
   is not a dict

 in {
    "output": []

}
[job r-MakeBarLibParTables.cwl] completed permanentFail
[job r-MakeBarLibParTables.cwl] {}
Final process status is permanentFail
[job r-MakeBarLibParTables.cwl] Removing input staging directory /data/tmp/tmpUZ28CM
[job r-MakeBarLibParTables.cwl] Removing temporary directory /data/tmp/tmpdj2ina
[job r-MakeBarLibParTables.cwl] Removing empty output directory /data/tmp/tmp9YTLLT
Workflow error, try again with --debug for more information:
  Process status is ['permanentFail']
Traceback (most recent call last):
  File "/usr/local/lib/python2.7/dist-packages/cwltool/main.py", line 721, in main
    **vars(args))
  File "/usr/local/lib/python2.7/dist-packages/cwltool/main.py", line 226, in single_job_executor
    raise WorkflowException(u"Process status is %s" % (final_status))
WorkflowException: Process status is ['permanentFail']

Any help appreciated.  Thanks,

Dzung

Michael Crusoe

Sep 20, 2016, 8:34:55 AM
to Dzu T, common-workf...@googlegroups.com
[including the mailing list on my reply]

On Tue, Sep 20, 2016 at 1:25 PM, Michael Crusoe <michael...@gmail.com> wrote:
On Tue, Sep 20, 2016 at 1:21 AM, Dzu T <dzu...@gmail.com> wrote:
Hello,

   I am attempting to run an R script in CWL.  The R script takes a sample sheet as input, creates barcode.txt and library_params.txt files, and puts them into 8 directories, one for each lane of a flow cell.  I can run the command fine at the shell prompt (yellow highlight in the log below), but the same command in cwl-runner (blue highlight) could not load the 'optparse' library specified in the R script (red highlight).

Hello Dzung,

This mailing list is best for discussion about CWL standards development. We suggest asking questions about use on https://www.biostars.org/t/cwl/ 

In this particular case my guess would be that the 'optparse' R package was installed only for your user, not for all users.
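
One likely mechanism behind this: the CWL v1.0 specification has the runner set HOME to the job's output directory, so a package installed in a per-user R library under the real home directory may not be found. A minimal sketch of one possible workaround, assuming the package really does live in a per-user library, is to point R at that library through an EnvVarRequirement. The library path below is hypothetical; the output of .libPaths() in an interactive R session shows the real one. Installing optparse system-wide, or running the tool in a container, would avoid the issue as well.

```yaml
# Sketch only: add this block to r-MakeBarLibParTables.cwl.
requirements:
  EnvVarRequirement:
    envDef:
      # Hypothetical path (an assumption, not taken from the log above).
      # Replace it with the per-user library reported by .libPaths() in R.
      R_LIBS_USER: /home/ubuntu/R/x86_64-pc-linux-gnu-library/3.3
```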
 
   I am also not sure how to specify, in CWL, the output folders and files generated by the R script.

   My r-MakeBarLibParTables-job_v2.yml  looks like this:

 dir: Run4_forR/

   And the r-MakeBarLibParTables.cwl looks like this:

cwlVersion: v1.0

class: CommandLineTool
baseCommand: Rscript 
 
inputs:

  - id: dir
    type: string
    inputBinding:
      position: 2
      prefix: "--dir"
outputs:
  output:
    type: File
 
arguments:
  - valueFrom: "/home/ubuntu/MakeBarcodeTablesV3.R"
    position: 1

For fixed arguments that should appear right after the baseCommand you can include them in the baseCommand itself:

baseCommand: [ Rscript, /home/ubuntu/MakeBarcodeTablesV3.R ]

Of course that brings up concerns about portability of that path.
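
One way to ease the portability concern, and to address the question about outputs, is sketched below. It is a sketch under assumptions, not a tested description: it assumes the script writes its eight lane directories beneath the current working directory, and that each one contains barcode.txt and library_params.txt as described above. The input name `script` and the output names `barcodes` and `library_params` are made up for illustration. The script becomes a File input instead of a hard-coded path, `dir` becomes a Directory so the runner can stage it, and each output gets an outputBinding with a glob pattern.

```yaml
cwlVersion: v1.0
class: CommandLineTool
baseCommand: Rscript

inputs:
  # Hypothetical input: the R script itself, supplied in the job file
  # rather than hard-coded as an absolute path.
  script:
    type: File
    inputBinding:
      position: 1
  # Declared as a Directory so the runner stages it and passes a valid path.
  dir:
    type: Directory
    inputBinding:
      position: 2
      prefix: --dir

outputs:
  # Assumes one barcode.txt per lane directory under the working directory.
  barcodes:
    type: File[]
    outputBinding:
      glob: "*/barcode.txt"
  # Assumes one library_params.txt per lane directory, same layout.
  library_params:
    type: File[]
    outputBinding:
      glob: "*/library_params.txt"
```

With this layout the job file would give `script` and `dir` as objects with `class: File` and `class: Directory` respectively, each with a `path`. If the script instead writes into the input directory, the globs above would match nothing, and the tool description would need to be adapted.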

--
Michael R. Crusoe
Community Engineer & Co-founder
Common Workflow Language project
https://impactstory.org/u/0000-0002-2961-9670
michael...@gmail.com
+40 720 781 765
+1 480 627 9108


