Error during strainphlan

Megan Folkerts

Dec 19, 2019, 3:39:27 PM
to MetaPhlAn-users
Hello again,

I'm getting the following error when running strainphlan.py:

2019-12-19 13:12:01,532 | INFO | __main__ | strainer | 1353 | Load mpa_pkl
2019-12-19 13:12:50,720 | INFO | __main__ | strainer | 1369 | Get clades from db
2019-12-19 13:12:56,611 | INFO | __main__ | strainer | 1413 | Get clades from samples
2019-12-19 13:12:56,909 | DEBUG | __main__ | load_sample | 1142 | load samplemarkersout/105.markers
2019-12-19 13:12:56,909 | DEBUG | __main__ | load_sample | 1142 | load samplemarkersout/113.markers
2019-12-19 13:12:56,909 | DEBUG | __main__ | load_sample | 1142 | load samplemarkersout/11.markers
2019-12-19 13:12:56,909 | DEBUG | __main__ | load_sample | 1142 | load samplemarkersout/10.markers
2019-12-19 13:12:56,909 | DEBUG | __main__ | load_sample | 1142 | load samplemarkersout/19.markers
2019-12-19 13:12:56,910 | DEBUG | __main__ | load_sample | 1142 | load samplemarkersout/202.markers
2019-12-19 13:12:56,911 | DEBUG | __main__ | load_sample | 1142 | load samplemarkersout/205.markers
2019-12-19 13:12:56,914 | DEBUG | __main__ | load_sample | 1142 | load samplemarkersout/208.markers
2019-12-19 13:12:59,586 | DEBUG | __main__ | load_sample | 1142 | load samplemarkersout/211.markers
Traceback (most recent call last):
  File "/scratch/mfolkerts/bin/metaphlan/strainphlan.py", line 1570, in <module>
    strainphlan()
  File "/scratch/mfolkerts/bin/metaphlan/strainphlan.py", line 1566, in strainphlan
    strainer(args)
  File "/scratch/mfolkerts/bin/metaphlan/strainphlan.py", line 1417, in strainer
    kept_markers=kept_markers)
  File "/scratch/mfolkerts/bin/metaphlan/strainphlan.py", line 1261, in load_all_samples
    use_threads=args['use_threads'])
  File "/scratch/mfolkerts/bin/metaphlan/strainphlan_src/ooSubprocess.py", line 258, in parallelize
    results = pool.map(func, args)
  File "/home/mfolkerts/miniconda3/envs/python2.7/lib/python2.7/multiprocessing/pool.py", line 253, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/home/mfolkerts/miniconda3/envs/python2.7/lib/python2.7/multiprocessing/pool.py", line 572, in get
    raise self._value
Exception: Traceback (most recent call last):
  File "/scratch/mfolkerts/bin/metaphlan/strainphlan_src/ooSubprocess.py", line 244, in wrapper
    return f(*args, **kwargs)
  File "/scratch/mfolkerts/bin/metaphlan/strainphlan.py", line 1206, in load_sample
    clade = db['markers'][marker]['taxon'].split('|')[-1]
KeyError: '227940__GeneID:1260616'

Here is the code I'm running:

strainphlan.py --ifn_samples samplemarkersout/*.markers --output_dir . --print_clades_only --nprocs_main 8 --mpa_pkl /scratch/mfolkerts/bin/metaphlan_databases/mpa_v295_CHOCOPhlAn_201901.pkl > clades.txt

I've downloaded metaphlan2 from the bitbucket repository to bypass the current bug that prevents strainphlan from running properly when it's installed through conda. sample2markers.py seems to have worked as expected (after a lot of finagling), and I've checked the marker files themselves; they seem comparable in format to those in the tutorial. Do you have any idea what would cause this?

Thanks!

Megan

Aitor Blanco-Miguez

Dec 20, 2019, 10:32:24 AM
to MetaPhlAn-users
Hi Megan,
It seems that there is a problem with some markers from viruses.
While we solve this issue, you can run StrainPhlAn using the mpa_v294_CHOCOPhlAn_201901.pkl MetaPhlAn2.9 database.
It is the same database as v295 but without the markers for the viruses.

I will contact you again once the issue with the v295 database is solved.

Best,
Aitor

Megan Folkerts

Dec 20, 2019, 11:41:35 AM
to Aitor Blanco-Miguez, MetaPhlAn-users
Hi Aitor,

Thanks for getting back to me. Unfortunately, this seems to be database independent. I've tried multiple versions of chocophlan, as well as the old mpa_v20 database, and I get the same error every time (though the offending GeneID changes). I'll keep troubleshooting on my end; maybe sample2markers didn't run as it should have.
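
For anyone hitting the same KeyError: the traceback shows that a marker ID read from a sample has no entry in db['markers']. Below is a minimal sketch of how one might list such IDs before running the full pipeline. It assumes the marker IDs have already been pulled out of the .markers files into a plain list and that db['markers'] is a dict keyed by marker ID, as the lookup in the traceback suggests. The function names and the toy data are hypothetical and are not part of strainphlan.py.

```python
def find_missing_markers(db_markers, sample_marker_ids):
    """Return the sample marker IDs that have no entry in the database dict."""
    return sorted(m for m in set(sample_marker_ids) if m not in db_markers)


def clade_of(db_markers, marker):
    """Same lookup as in the traceback, but returns None instead of raising KeyError."""
    entry = db_markers.get(marker)
    if entry is None:
        return None
    return entry['taxon'].split('|')[-1]


# Toy stand-in for db['markers']; the real database is loaded from the .pkl file.
toy_db_markers = {
    'markerA': {'taxon': 'k__Bacteria|s__Example_species'},
}
# Toy stand-in for the IDs found in one sample; the second ID is the one from the error.
toy_sample_ids = ['markerA', '227940__GeneID:1260616']

missing = find_missing_markers(toy_db_markers, toy_sample_ids)
print(missing)
```

If the list of missing IDs is non-empty for every database tried, the mismatch is more likely in the script or the marker files than in any one database.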

--
Megan Folkerts, MS
Research Associate II
Center for Emerging Pathogens and Technologies
TGen North
Flagstaff, Arizona
928-226-6375


Megan Folkerts

Dec 20, 2019, 12:12:34 PM
to Aitor Blanco-Miguez, MetaPhlAn-users
Never mind on this. It seems the issue was that when I downloaded the repository from bitbucket, the version of strainphlan.py I got was an older one. I updated to the current version and everything ran smoothly.