I have a Nextflow script. The command in its script section takes an argument that uses a wildcard to accept an arbitrary number of files. With genomic data, the expanded file list will eventually exceed Linux's command-line length limit, which is the bug I would like to fix.
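For context, here is a rough Python sketch of how one might check that limit on a given machine. It assumes a POSIX system where `os.sysconf` exposes `SC_ARG_MAX`. The helper names `arg_max_bytes` and `fits_on_command_line` are made up for illustration, and the estimate ignores the space taken by environment variables, so treat it as an approximation only.

```python
import os


def arg_max_bytes():
    """Return the system limit for argv plus environ in bytes, or None if unknown."""
    try:
        return os.sysconf("SC_ARG_MAX")
    except (AttributeError, ValueError, OSError):
        # os.sysconf is missing on some platforms, or the name is not supported
        return None


def fits_on_command_line(paths, limit):
    """Rough check: each argument costs its encoded length plus a terminating NUL.

    This ignores the environment and other overhead, so it is optimistic.
    """
    needed = sum(len(os.fsencode(p)) + 1 for p in paths)
    return needed < limit
```

Calling `fits_on_command_line(my_files, arg_max_bytes())` with a hypothetical list `my_files` gives a quick idea of whether the wildcard expansion is likely to be a problem, provided `arg_max_bytes()` did not return `None`.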
Since Nextflow is written in Java, it has no such limit. What if I write the file names, one per line, into a file, which I'll call toc_file (for "table of contents")? Something like:
def tsv = ["foo.tsv", "bar.tsv", "qaz.tsv"]
File toc = new File("toc-files-to-merge.txt")
toc.withWriter("UTF-8") { out ->
    tsv.each { out.writeLine(it) }
}
Consider this Nextflow script:
process passFileNames {
    def toc_file = "toc-files-to-merge.txt"
    def tsv = ["foo.tsv", "bar.tsv", "qaz.tsv"]
    File toc = new File(toc_file)
    toc.withWriter("UTF-8") { out ->
        tsv.each { out.writeLine(it) }
    }

    """
    python /Users/jkern/wip/nxf-wildcards/acceptor.py ${toc_file} > file
    """
}
The Python script then reads the file names from that file. Here is acceptor.py:
import argparse
import os


def read_toc(toc_file):
    print(f"file: {toc_file}")
    if os.path.exists(toc_file):
        with open(toc_file) as fd:
            for row in fd:
                print(row)
    else:
        print("file not found")


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("toc_file")
    args = parser.parse_args()
    read_toc(args.toc_file)
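To narrow down where the file is actually being looked for, a slightly more verbose reader might help. The sketch below is only a debugging aid, not a fix: `describe_toc` is a hypothetical helper that reports the directory the script is running in and the absolute path the relative file name resolves to, and strips the trailing newline from each entry.

```python
import os


def describe_toc(toc_file):
    """Return (current directory, absolute path of toc_file, file names or None).

    The third item is None when the file does not exist at the resolved path.
    """
    cwd = os.getcwd()
    abs_path = os.path.abspath(toc_file)  # relative names resolve against cwd
    if not os.path.exists(abs_path):
        return cwd, abs_path, None
    with open(abs_path, encoding="utf-8") as fd:
        # drop trailing newlines and skip blank lines
        names = [line.strip() for line in fd if line.strip()]
    return cwd, abs_path, names
```

Printing the three returned values from inside the task shows which directory the Python process sees and whether the TOC file is present there.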
I had expected the toc_file to exist in the work directory, but it doesn't. Why? I do not see any error message. What is happening here?
-jk