Hi Nathan,
I installed GRIT-1.1.3 on our cluster (CentOS 6.1, Python 2.7.2).
I tested the program on the example data (GRIT_example.tar) with the following command:
run_grit.py --control AdMatedF_Ecl_20days_Heads.control.txt --reference flybase-r5.45.chr4.gtf --ucsc -t 8 -b
The first step (building 'elements') completed without problems and generated the output file 'discovered.AdMatedF_Ecl_20days_Heads.elements.bed'. However, a series of errors occurred afterwards (during the transcript assembly step, as far as I understand). These include errors in the 'multiprocessing' module (part of the Python standard library) as well as in the GRIT 'build_transcripts.py' script, such as:
Spawning new worker child
Building transcript and ORFs for Gene chr4_m_36
Finding design matrix for Gene chr4_m_38(chr4:-:1225902-1234438) - 14 transcripts
Process Process-59:
FINISHED Building transcript and ORFs for Gene chr4_m_37
Building transcript and ORFs for Gene chr4_m_35
Spawning new worker child
Traceback (most recent call last):
FINISHED Building transcript and ORFs for Gene chr4_m_36
File "/hpcf/apps/python/install/2.7.2/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
Finding design matrix for Gene chr4_m_37
FINISHED Building transcript and ORFs for Gene chr4_m_35
Spawning new worker child
Errors in the 'build_transcripts.py' script:
Traceback (most recent call last):
self.run()
self.run()
self._target(*self._args, **self._kwargs)
File "/hpcf/apps/python/install/2.7.2/lib/python2.7/multiprocessing/process.py", line 114, in run
Waiting for free children
Waiting for write queue to fill.
self._target(*self._args, **self._kwargs)
self._target(*self._args, **self._kwargs)
self._target(*self._args, **self._kwargs)
File "/hpcf/apps/python/install/2.7.2/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
File "/hpcf/apps/python/install/2.7.2/lib/python2.7/multiprocessing/process.py", line 114, in run
File "/hpcf/apps/python/install/2.7.2/lib/python2.7/multiprocessing/process.py", line 114, in run
File "/hpcf/apps/python/install/2.7.2/lib/python2.7/site-packages/GRIT-1.1.3-py2.7-linux-x86_64.egg/grit/build_transcripts.py", line 754, in worker
self._target(*self._args, **self._kwargs)
File "/hpcf/apps/python/install/2.7.2/lib/python2.7/site-packages/GRIT-1.1.3-py2.7-linux-x86_64.egg/grit/build_transcripts.py", line 754, in worker
File "/hpcf/apps/python/install/2.7.2/lib/python2.7/site-packages/GRIT-1.1.3-py2.7-linux-x86_64.egg/grit/build_transcripts.py", line 754, in worker
File "/hpcf/apps/python/install/2.7.2/lib/python2.7/site-packages/GRIT-1.1.3-py2.7-linux-x86_64.egg/grit/build_transcripts.py", line 754, in worker
self._target(*self._args, **self._kwargs)
self._target(*self._args, **self._kwargs)
self.run()
File "/hpcf/apps/python/install/2.7.2/lib/python2.7/site-packages/GRIT-1.1.3-py2.7-linux-x86_64.egg/grit/build_transcripts.py", line 754, in worker
File "/hpcf/apps/python/install/2.7.2/lib/python2.7/site-packages/GRIT-1.1.3-py2.7-linux-x86_64.egg/grit/build_transcripts.py", line 754, in worker
File "/hpcf/apps/python/install/2.7.2/lib/python2.7/site-packages/GRIT-1.1.3-py2.7-linux-x86_64.egg/grit/build_transcripts.py", line 754, in worker
File "/hpcf/apps/python/install/2.7.2/lib/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "/hpcf/apps/python/install/2.7.2/lib/python2.7/site-packages/GRIT-1.1.3-py2.7-linux-x86_64.egg/grit/build_transcripts.py", line 754, in worker
Waiting for free children
Waiting for write queue to fill.
Eventually the program ran to completion and generated all 6 expected files:
-rw-r--r-- 1 134 Nov 17 13:23 discovered.AdMatedF_Ecl_20days_Heads.rep2.expression.csv
-rw-r--r-- 1 64 Nov 17 13:23 discovered.AdMatedF_Ecl_20days_Heads.rep2.gtf
-rw-r--r-- 1 138 Nov 17 13:23 discovered.log
-rw-r--r-- 1 134 Nov 17 13:22 discovered.AdMatedF_Ecl_20days_Heads.rep1.expression.csv
-rw-r--r-- 1 64 Nov 17 13:22 discovered.AdMatedF_Ecl_20days_Heads.rep1.gtf
-rw-r--r-- 1 277K Nov 17 13:22 discovered.AdMatedF_Ecl_20days_Heads.elements.bed
but only the 'elements' file (discovered.AdMatedF_Ecl_20days_Heads.elements.bed) has any real content; the others contain only header lines.
I then turned off parallel processing (the -t option) to bypass the errors raised in the 'multiprocessing' module. However, errors still occurred once the transcript building step began. Here is the screen output for that step (after the 'elements' were built):
Loading discovered.AdMatedF_Ecl_20days_Heads.elements.bed
Finished Loading discovered.AdMatedF_Ecl_20days_Heads.elements.bed
Estimating the fragment length distribution
Finished estimating the fragment length distribution
Initializing processing data
Clustering elements into genes for 4:-
Clustering elements into genes for 4:+
Finished initializing processing data
Spawning new worker child
Initializing background writer
Waiting for write queue to fill.
Waiting for write queue to fill.
Building transcript and ORFs for Gene 4_p_34
FINISHED Building transcript and ORFs for Gene 4_p_34
Finding design matrix for Gene 4_p_34
Finding design matrix for Gene 4_p_34(4:+:1219475-1226882) - 2 transcripts
Waiting for write queue to fill.
Finished building transcripts
Traceback (most recent call last):
File "/hpcf/apps/python/install/2.7.2/bin/run_grit.py", line 5, in <module>
pkg_resources.run_script('GRIT==1.1.3', 'run_grit.py')
File "/hpcf/apps/python/install/2.7.2/lib/python2.7/site-packages/distribute-0.6.34-py2.7.egg/pkg_resources.py", line 505, in run_script
self.require(requires)[0].run_script(script_name, ns)
File "/hpcf/apps/python/install/2.7.2/lib/python2.7/site-packages/distribute-0.6.34-py2.7.egg/pkg_resources.py", line 1245, in run_script
execfile(script_filename, namespace, namespace)
File "/sonas/hpcf/apps/python/install/2.7.2/lib/python2.7/site-packages/GRIT-1.1.3-py2.7-linux-x86_64.egg/EGG-INFO/scripts/run_grit.py", line 515, in <module>
main()
File "/sonas/hpcf/apps/python/install/2.7.2/lib/python2.7/site-packages/GRIT-1.1.3-py2.7-linux-x86_64.egg/EGG-INFO/scripts/run_grit.py", line 510, in main
estimate_confidence_bounds=args.estimate_confidence_bounds )
File "/hpcf/apps/python/install/2.7.2/lib/python2.7/site-packages/GRIT-1.1.3-py2.7-linux-x86_64.egg/grit/build_transcripts.py", line 948, in build_and_quantify_transcripts
estimate_confidence_bounds)
File "/hpcf/apps/python/install/2.7.2/lib/python2.7/site-packages/GRIT-1.1.3-py2.7-linux-x86_64.egg/grit/build_transcripts.py", line 873, in spawn_and_manage_children
worker(*args)
File "/hpcf/apps/python/install/2.7.2/lib/python2.7/site-packages/GRIT-1.1.3-py2.7-linux-x86_64.egg/grit/build_transcripts.py", line 754, in worker
(rnaseq_reads, promoter_reads, polya_reads) )
File "/hpcf/apps/python/install/2.7.2/lib/python2.7/site-packages/GRIT-1.1.3-py2.7-linux-x86_64.egg/grit/build_transcripts.py", line 472, in build_design_matrices_worker
config.MAX_NUM_TRANSCRIPTS)
File "/hpcf/apps/python/install/2.7.2/lib/python2.7/site-packages/GRIT-1.1.3-py2.7-linux-x86_64.egg/grit/f_matrix.py", line 834, in __init__
self._build_rnaseq_arrays(gene, rnaseq_reads, fl_dists)
File "/hpcf/apps/python/install/2.7.2/lib/python2.7/site-packages/GRIT-1.1.3-py2.7-linux-x86_64.egg/grit/f_matrix.py", line 693, in _build_rnaseq_arrays
expected_rnaseq_array, observed_rnaseq_array)
File "/hpcf/apps/python/install/2.7.2/lib/python2.7/site-packages/GRIT-1.1.3-py2.7-linux-x86_64.egg/grit/f_matrix.py", line 600, in cluster_rows
(len(clusters), expected_rnaseq_array.shape[1]) )
TypeError: object of type 'generator' has no len()
Process Process-4:
Traceback (most recent call last):
File "/hpcf/apps/python/install/2.7.2/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/hpcf/apps/python/install/2.7.2/lib/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "/hpcf/apps/python/install/2.7.2/lib/python2.7/site-packages/GRIT-1.1.3-py2.7-linux-x86_64.egg/grit/build_transcripts.py", line 575, in write_finished_data_to_disk
write_type, key = data.finished_queue.get(timeout=0.1)
File "<string>", line 2, in get
File "/hpcf/apps/python/install/2.7.2/lib/python2.7/multiprocessing/managers.py", line 758, in _callmethod
conn.send((self._id, methodname, args, kwds))
IOError: [Errno 32] Broken pipe
This time the program died partway through the run.
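In case it helps narrow things down: the TypeError at f_matrix.py line 600 suggests that 'clusters' is a generator at the point where len() is called on it. I have not worked out why that happens in our environment. The sketch below only reproduces the same error message in isolation. The function make_clusters and the sample rows are made up for illustration and are not GRIT code, and converting the generator to a list is just an assumption about a possible workaround, not a confirmed fix.

```python
# Minimal reproduction of "object of type 'generator' has no len()".
# NOTE: make_clusters is a hypothetical stand-in, not GRIT's actual code.

def make_clusters(rows):
    """Lazily yield lists of indices of identical rows (a generator)."""
    seen = {}
    for index, row in enumerate(rows):
        seen.setdefault(tuple(row), []).append(index)
    for indices in seen.values():
        yield indices

rows = [(1, 0), (1, 0), (0, 1)]  # made-up sample data

# Calling len() directly on the generator fails, as in the traceback.
clusters = make_clusters(rows)
try:
    len(clusters)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Materializing the generator into a list first makes len() work.
clusters = list(make_clusters(rows))
num_clusters = len(clusters)
print(num_clusters)
```

If something like this is what happens inside cluster_rows, wrapping the value in list() before the len() call might be enough, but I would rather not patch the installed egg without your advice.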
It seems that similar errors were also observed by other users:
https://groups.google.com/forum/#!topic/grit-bio/-V52TvqG91o
Could you give us any pointers on how to fix this? Your help is greatly appreciated!
Best,
Yuxin