Invalid process input definition


gary...@gmail.com

Jun 7, 2018, 11:37:11 PM
to Nextflow

Hi all, I'm new to Nextflow and I'm wondering whether this is the right way to generate data for each group size and then run inference on each generated dataset. I suspect there is a syntax error in generateData or runInference, since it gives me ERROR ~ Invalid process input definition. I find this really difficult to code because there are very few examples.




deliverableDir = 'deliverables/' + workflow.scriptName.replace('.nf','')

nGroups = 2
minGroupSize = 10
maxGroupSize = 20


process build {
  cache false
  output:
    file 'jars_hash' into jars_hash
    file 'classpath' into classpath

  """
  set -e
  current_dir=`pwd`
  cd ../../../..
  ./gradlew build
  ./gradlew printClasspath | grep CLASSPATH-ENTRY | sort | sed 's/CLASSPATH[-]ENTRY //' > \$current_dir/temp_classpath
  for file in `ls build/libs/*jar`
  do
    echo `pwd`/\$file >> \$current_dir/temp_classpath
  done
  cd -
  touch jars_hash
  for jar_file in `cat temp_classpath`
  do
    shasum \$jar_file >> jars_hash
  done
  cat temp_classpath | paste -sd ":" - > classpath
  """
}


jars_hash.into {
  jars_hash1
  jars_hash2
}

classpath.into {
  classpath1
  classpath2
}

process generateData {
  cache 'deep'
  input:
    each i from minGroupSize..maxGroupSize
    file classpath1
    file jars_hash1
  output:
    file 'generated$i' into data

  """
  set -e
  java -cp `cat classpath` -Xmx2g matchings.PermutedClustering \
    --experimentConfigs.managedExecutionFolder false \
    --experimentConfigs.saveStandardStreams false \
    --experimentConfigs.recordExecutionInfo false \
    --experimentConfigs.recordGitInfo false \
    --model.nGroups $nGroups \
    --model.groupSize $i \
    --engine Forward
  mv samples generated$i
  """
}


process runInference {
  cache 'deep'
  input:
    each i from minGroupSize..maxGroupSize
    file data.collect()
    file classpath2
    file jars_hash2
  output:
    file 'samples' into samples

  """
  set -e
  tail -n +2 generated${i}/observations.csv | awk -F "," '{print \$2, ",", \$3, ",", \$4}' | sed 's/ //g' > data.csv
  java -cp `cat classpath` -Xmx2g matchings.PermutedClustering \
    --initRandom 123 \
    --experimentConfigs.managedExecutionFolder false \
    --experimentConfigs.saveStandardStreams false \
    --experimentConfigs.recordExecutionInfo false \
    --experimentConfigs.recordGitInfo false \
    --model.nGroups $nGroups \
    --model.groupSize $i \
    --model.observations.file data.csv \
    --engine PT \
    --engine.nScans 2_000 \
    --engine.nThreads MAX \
    --engine.nChains 8
  done
  mv samples generated$i
  """
}

Paolo Di Tommaso

Jun 8, 2018, 5:09:58 AM
to nextflow
Please include the complete error message produced by NF. 

p


Peiyuan Zhu

Jun 8, 2018, 5:42:59 AM
to next...@googlegroups.com
The complete error message is: `ERROR ~ Invalid process input definition on line process runInference`

Paolo Di Tommaso

Jun 8, 2018, 6:50:27 AM
to nextflow
I guess the problem is the input definition `file data.collect()`, which should instead be `file foo from data.collect()`.


p
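
For illustration, here is a minimal sketch of how the suggested fix might look in the runInference input block, using the same DSL1 syntax as the question. The name `generated` is a hypothetical stand-in for `foo` (any identifier should work), and the script body is elided rather than rewritten. The commented hint about generateData is a separate assumption about what was intended: Groovy only interpolates `$i` inside double-quoted strings, so `'generated$i'` would be taken literally.

```groovy
// Assumption (not part of the reply above): in generateData, a double-quoted
// output name lets Groovy interpolate $i:
//   output:
//     file "generated$i" into data

process runInference {
  cache 'deep'
  input:
    each i from minGroupSize..maxGroupSize
    // 'generated' is a hypothetical name; collect() emits all items of 'data' as a single list
    file generated from data.collect()
    file classpath2
    file jars_hash2
  output:
    file 'samples' into samples

  """
  # script body as in the question
  """
}
```

With `collect()`, every run of the process receives the full set of generated datasets, and the script picks the one it needs through `generated${i}`.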