DSL2/Nextflow workflow: issue passing variable to process/"java.lang.NullPointerException"

Kristin Muench

May 7, 2021, 5:40:02 AM
to Nextflow
Hello,
I am a new user of Nextflow, and I am trying to convert my pipeline to a DSL2/Nextflow workflow. Unfortunately, I'm running into errors - I think with passing files into the process - and I don't know what I'm doing wrong. Is anyone able to spot what I'm doing incorrectly?

Traditionally, I would run this pipeline in a bash script:

# # # # # # # 

INPUT_FILE=/path/to/myfile.fa

./python_script1.py --input ${INPUT_FILE} --output_basename ${NAME}  # this script saves a .csv and another .txt when it runs
./python_script2.py --input ${NAME}.txt --output_basename ${NAME} 

# # # # # # # 

My current Nextflow script looks like:

# # # # # # # 

// TEST PATHS
params.output_basename="myname"
params.input_file="/path/to/myfile.fa"

// additional setup
nextflow.enable.dsl=2

// write status into log
log.info "input fasta file : ${params.input_file}"

// set up channels
channel
.fromPath( params.input_file )
.ifEmpty { error "Cannot find any fasta files matching: ${params.input_file}" }
.set { fa_file }

//processes
process python_script1{

input:
val(fa) from fa_file

output:
val("./${output_basename}/swap_candidates/${output_basename}.txt" emit: first_text)
val("./${output_basename}/swap_candidates/${output_basename}.csv" emit: first_csv)

script:
"""
echo "Script 1 is running..."
./python_script1.py --input $fa --output_basename ${params.output_basename}
"""
}

include { python_script2 } from './nf_modules/python_script2/main'

workflow {
python_script1(fa_file)
python_script2(python_script1.out.first_text)
}


# # # # # # # 

When I attempt to run it, nothing appears in the work directory, and I get this error:

Launching `my_workflow.nf` [happy_sax] - revision: 68489e048a
General error during parsing: java.lang.NullPointerException

java.lang.NullPointerException
at org.apache.groovy.parser.antlr4.AstBuilder.validateExpressionListElement(AstBuilder.java:3420)
at org.apache.groovy.parser.antlr4.AstBuilder.visitExpressionListElement(AstBuilder.java:3399)
at org.apache.groovy.parser.antlr4.AstBuilder.visitEnhancedArgumentListElement(AstBuilder.java:2619)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1376)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
at org.apache.groovy.parser.antlr4.AstBuilder.visitEnhancedArgumentListInPar(AstBuilder.java:2556)
at org.apache.groovy.parser.antlr4.AstBuilder.visitCommandExpression(AstBuilder.java:2057)
at org.apache.groovy.parser.antlr4.AstBuilder.visitCommandExprAlt(AstBuilder.java:2028)
at org.apache.groovy.parser.antlr4.AstBuilder.visitCommandExprAlt(AstBuilder.java:341)
at org.apache.groovy.parser.antlr4.GroovyParser$CommandExprAltContext.accept(GroovyParser.java:8051)
at groovyjarjarantlr4.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:48)
at org.apache.groovy.parser.antlr4.GroovyParserBaseVisitor.visitExpressionStmtAlt(GroovyParserBaseVisitor.java:402)
at org.apache.groovy.parser.antlr4.GroovyParser$ExpressionStmtAltContext.accept(GroovyParser.java:6754)
at groovyjarjarantlr4.v4.runtime.tree.AbstractParseTreeVisitor.visit(AbstractParseTreeVisitor.java:20)
at org.apache.groovy.parser.antlr4.AstBuilder.visit(AstBuilder.java:4218)
at org.apache.groovy.parser.antlr4.AstBuilder.visitLabeledStmtAlt(AstBuilder.java:1016)
at org.apache.groovy.parser.antlr4.AstBuilder.visitLabeledStmtAlt(AstBuilder.java:341)
at org.apache.groovy.parser.antlr4.GroovyParser$LabeledStmtAltContext.accept(GroovyParser.java:6721)
at groovyjarjarantlr4.v4.runtime.tree.AbstractParseTreeVisitor.visit(AbstractParseTreeVisitor.java:20)
at org.apache.groovy.parser.antlr4.AstBuilder.visit(AstBuilder.java:4218)
at org.apache.groovy.parser.antlr4.AstBuilder.visitBlockStatement(AstBuilder.java:4017)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1376)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
at org.apache.groovy.parser.antlr4.AstBuilder.visitBlockStatements(AstBuilder.java:4006)
at org.apache.groovy.parser.antlr4.AstBuilder.visitBlockStatementsOpt(AstBuilder.java:3993)
at org.apache.groovy.parser.antlr4.AstBuilder.visitClosure(AstBuilder.java:3655)
at org.apache.groovy.parser.antlr4.AstBuilder.visitClosureOrLambdaExpression(AstBuilder.java:3981)
at org.apache.groovy.parser.antlr4.AstBuilder.visitPathElement(AstBuilder.java:2404)
at org.apache.groovy.parser.antlr4.AstBuilder.lambda$createPathExpression$34(AstBuilder.java:4308)
at java.util.stream.ReduceOps$1ReducingSink.accept(ReduceOps.java:80)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1376)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.reduce(ReferencePipeline.java:474)
at org.apache.groovy.parser.antlr4.AstBuilder.createPathExpression(AstBuilder.java:4304)
at org.apache.groovy.parser.antlr4.AstBuilder.visitPathExpression(AstBuilder.java:2254)
at org.apache.groovy.parser.antlr4.AstBuilder.visitPostfixExpression(AstBuilder.java:2733)
at org.apache.groovy.parser.antlr4.AstBuilder.visitPostfixExpression(AstBuilder.java:341)
at org.apache.groovy.parser.antlr4.GroovyParser$PostfixExpressionContext.accept(GroovyParser.java:8092)
at groovyjarjarantlr4.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:48)
at org.apache.groovy.parser.antlr4.GroovyParserBaseVisitor.visitPostfixExprAlt(GroovyParserBaseVisitor.java:162)
at org.apache.groovy.parser.antlr4.GroovyParser$PostfixExprAltContext.accept(GroovyParser.java:8173)
at groovyjarjarantlr4.v4.runtime.tree.AbstractParseTreeVisitor.visit(AbstractParseTreeVisitor.java:20)
at org.apache.groovy.parser.antlr4.AstBuilder.visit(AstBuilder.java:4218)
at org.apache.groovy.parser.antlr4.AstBuilder.visitExpressionListElement(AstBuilder.java:3397)
at org.apache.groovy.parser.antlr4.AstBuilder.visitEnhancedArgumentListElement(AstBuilder.java:2619)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1376)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
at java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:418)
at org.apache.groovy.parser.antlr4.AstBuilder.visitEnhancedArgumentListInPar(AstBuilder.java:2556)
at org.apache.groovy.parser.antlr4.AstBuilder.visitCommandExpression(AstBuilder.java:2057)
at org.apache.groovy.parser.antlr4.AstBuilder.visitCommandExprAlt(AstBuilder.java:2028)
at org.apache.groovy.parser.antlr4.AstBuilder.visitCommandExprAlt(AstBuilder.java:341)
at org.apache.groovy.parser.antlr4.GroovyParser$CommandExprAltContext.accept(GroovyParser.java:8051)
at groovyjarjarantlr4.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:48)
at org.apache.groovy.parser.antlr4.GroovyParserBaseVisitor.visitExpressionStmtAlt(GroovyParserBaseVisitor.java:402)
at org.apache.groovy.parser.antlr4.GroovyParser$ExpressionStmtAltContext.accept(GroovyParser.java:6754)
at groovyjarjarantlr4.v4.runtime.tree.AbstractParseTreeVisitor.visitChildren(AbstractParseTreeVisitor.java:48)
at org.apache.groovy.parser.antlr4.GroovyParserBaseVisitor.visitScriptStatement(GroovyParserBaseVisitor.java:466)
at org.apache.groovy.parser.antlr4.GroovyParser$ScriptStatementContext.accept(GroovyParser.java:471)
at groovyjarjarantlr4.v4.runtime.tree.AbstractParseTreeVisitor.visit(AbstractParseTreeVisitor.java:20)
at org.apache.groovy.parser.antlr4.AstBuilder.visit(AstBuilder.java:4218)
at org.apache.groovy.parser.antlr4.AstBuilder.lambda$visitScriptStatements$0(AstBuilder.java:476)
at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1376)
at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:481)
at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:471)
at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
at org.apache.groovy.parser.antlr4.AstBuilder.visitScriptStatements(AstBuilder.java:477)
at org.apache.groovy.parser.antlr4.AstBuilder.visitCompilationUnit(AstBuilder.java:434)
at org.apache.groovy.parser.antlr4.AstBuilder.visitCompilationUnit(AstBuilder.java:341)
at org.apache.groovy.parser.antlr4.GroovyParser$CompilationUnitContext.accept(GroovyParser.java:317)
at groovyjarjarantlr4.v4.runtime.tree.AbstractParseTreeVisitor.visit(AbstractParseTreeVisitor.java:20)
at org.apache.groovy.parser.antlr4.AstBuilder.visit(AstBuilder.java:4218)
at org.apache.groovy.parser.antlr4.AstBuilder.buildAST(AstBuilder.java:424)
at org.apache.groovy.parser.antlr4.Antlr4ParserPlugin.buildAST(Antlr4ParserPlugin.java:58)
at org.codehaus.groovy.control.SourceUnit.buildAST(SourceUnit.java:257)
at java.util.Iterator.forEachRemaining(Iterator.java:116)
at java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
at java.util.stream.ReferencePipeline$Head.forEach(ReferencePipeline.java:580)
at org.codehaus.groovy.control.CompilationUnit.buildASTs(CompilationUnit.java:666)
at org.codehaus.groovy.control.CompilationUnit.compile(CompilationUnit.java:632)
at groovy.lang.GroovyClassLoader.doParseClass(GroovyClassLoader.java:389)
at groovy.lang.GroovyClassLoader.lambda$parseClass$3(GroovyClassLoader.java:332)
at org.codehaus.groovy.runtime.memoize.StampedCommonCache.compute(StampedCommonCache.java:163)
at org.codehaus.groovy.runtime.memoize.StampedCommonCache.getAndPut(StampedCommonCache.java:154)
at groovy.lang.GroovyClassLoader.parseClass(GroovyClassLoader.java:330)
at groovy.lang.GroovyShell.parseClass(GroovyShell.java:526)
at groovy.lang.GroovyShell.parse(GroovyShell.java:538)
at groovy.lang.GroovyShell.parse(GroovyShell.java:570)
at nextflow.script.ScriptParser.parse(ScriptParser.groovy:172)
at nextflow.script.ScriptParser.parse(ScriptParser.groovy:191)
at nextflow.script.ScriptParser.parse(ScriptParser.groovy:196)
at nextflow.script.ScriptRunner.parseScript(ScriptRunner.groovy:198)
at nextflow.script.ScriptRunner.execute(ScriptRunner.groovy:118)
at nextflow.cli.CmdRun.run(CmdRun.groovy:302)
at nextflow.cli.Launcher.run(Launcher.groovy:475)
at nextflow.cli.Launcher.main(Launcher.groovy:657)


I have two questions:
1. Am I correct in thinking that all my inputs and outputs in this case should be "val" because ultimately I'm passing a string into a command line command that in turn passes that value into Python?

2. I'm interpreting the error above as python_script1 isn't able to load fa_file. Is this correct? Why might this be?

Thank you for your help,
Kristin

Dan

May 9, 2021, 5:28:35 PM
to next...@googlegroups.com
Hi Kristin,

There are a few critical things:
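1. A channel created with fromPath emits file objects. With val(fa) the value is only passed along as-is and the file is not staged into the task's work directory. You need file(fa) in the process input so that Nextflow links the file into the work directory, where the script can refer to it by file name.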

2. For the output, if you specify --output_basename ${params.output_basename} in the script section, your output file would be written directly into that directory and you don't need an output channel (in fact, the output channel may not be able to find the output file).
The output channel is required, however, if a following process takes this output as its input. In that case, the Nextflow way is to write the output into the current folder (something equivalent to --output_basename . ) and add publishDir to the process: https://www.nextflow.io/docs/latest/process.html#publishdir

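In the output section, your lines have no comma before emit:, which is most likely what trips the parser (the NullPointerException is thrown while the script is being parsed, before any process runs). I don't know whether
val("./${output_basename}/swap_candidates/${output_basename}.csv", emit: first_csv)
would work, but at least it needs to be
val("./${output_basename}/swap_candidates/${output_basename}.csv"), emit: first_csv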
This part is a bit hard to explain in a few sentences, please see my notes here: https://github.com/danrlu/Nextflow_cheatsheet#the-working-directory

3. In DSL2, the input gets passed into a process through 
workflow {
python_script1(fa_file)
python_script2(python_script1.out.first_text)
}
So in the process definition, all you need is
input:
file(fa)
without the "from fa_file" part, which is DSL1 syntax.
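
Putting the three points together, a corrected python_script1 might look roughly like the sketch below. It is untested against your scripts and rests on a few assumptions: it uses path, the newer equivalent of file; python_script1.py is assumed to be executable and placed in the pipeline's bin/ directory so that it is on the PATH inside the task; the output layout is copied from the output lines in your question; and the "results" publish directory is a made-up name.

```nextflow
nextflow.enable.dsl = 2

params.output_basename = "myname"
params.input_file      = "/path/to/myfile.fa"

process python_script1 {
    // Hypothetical publish location; copies the declared outputs out of the work directory.
    publishDir "results", mode: 'copy'

    input:
    path fa    // DSL2: no "from"; the file is staged into the task work directory

    output:
    // Paths are relative to the task work directory and must match what the script writes.
    // The layout below is taken from the question; adjust it if the script differs.
    path "${params.output_basename}/swap_candidates/${params.output_basename}.txt", emit: first_text
    path "${params.output_basename}/swap_candidates/${params.output_basename}.csv", emit: first_csv

    script:
    """
    echo "Script 1 is running..."
    python_script1.py --input ${fa} --output_basename ${params.output_basename}
    """
}

workflow {
    // Fail early if the input file does not exist.
    fa_file = Channel.fromPath(params.input_file, checkIfExists: true)

    python_script1(fa_file)
    python_script1.out.first_text.view()   // print the emitted .txt path
}
```

From there, python_script1.out.first_text can be passed to python_script2 as in your workflow block, provided python_script2 also declares a file/path input.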

Everyone needs a bit of fiddling when getting started. Hang in there.
Dan
