Failed to load target/decoys & other troubleshooting

tlun...@nd.edu

Jun 24, 2019, 2:47:28 PM
to GlycReSoft
After running through the tutorial for GlycReSoft, I've run into three issues while trying to process my own data. Please forgive my ignorance, as I am not an experienced Python/terminal user.
Thank you in advance for any help! --Taylor

First, I am confused about why, during the in silico digest, many base peptides are found (2618 in the log below) but very few peptides (14) are produced after applying protein annotations. For context, my input:
$ glycresoft build-hypothesis glycopeptide-fa -g glycans.db -s hypothesis -G 1 -u 1 -e trypsin -m 1 -c "Carbamidomethyl (C)" -p 4 -n "MucMix Hyp" mucmix.fa fasta-mucmix.db

Results in:

13:12:18 - glycresoft:naive_glycope:135  - INFO - Digesting Proteins

13:12:18 - glycresoft:task         :295  - INFO - ...... Started digesting sp|Q02817|MUC2_HUMAN Mucin-2 (5179)

13:12:18 - glycresoft:task         :295  - INFO - ...... Started digesting sp|Q9HC84|MUC5B_HUMAN Mucin-5B (5762)

13:12:18 - glycresoft:task         :295  - INFO - ...... Started digesting sp|Q8WXI7|MUC16_HUMAN Mucin-16 (14507)

13:12:21 - glycresoft:task         :295  - INFO - ...... Finished digesting sp|Q02817|MUC2_HUMAN Mucin-2 (5179)

13:12:21 - glycresoft:task         :295  - INFO - ...... Started digesting sp|Q9UKN1|MUC12_HUMAN Mucin-12 (5478)

13:12:22 - glycresoft:task         :295  - INFO - ...... Finished digesting sp|Q9HC84|MUC5B_HUMAN Mucin-5B (5762)

13:12:23 - glycresoft:task         :295  - INFO - ...... Finished digesting sp|Q9UKN1|MUC12_HUMAN Mucin-12 (5478)

13:12:38 - glycresoft:task         :295  - INFO - ...... Finished digesting sp|Q8WXI7|MUC16_HUMAN Mucin-16 (14507)

13:12:38 - glycresoft:task         :295  - INFO - ...... Started digesting sp|P98088|MUC5A_HUMAN Mucin-5AC (5654)

13:12:41 - glycresoft:task         :295  - INFO - ...... Finished digesting sp|P98088|MUC5A_HUMAN Mucin-5AC (5654)

13:12:47 - glycresoft:naive_glycope:177  - INFO - 2618 Base Peptides Produced

13:12:47 - glycresoft:peptide_permu:586  - INFO - Begin Applying Protein Annotations

13:12:48 - glycresoft:peptide_permu:610  - INFO - ... 20.000% Complete (1/5). 0 Peptides Produced.

13:12:48 - glycresoft:peptide_permu:610  - INFO - ... 40.000% Complete (2/5). 3 Peptides Produced.

13:12:48 - glycresoft:peptide_permu:610  - INFO - ... 60.000% Complete (3/5). 7 Peptides Produced.

13:12:48 - glycresoft:peptide_permu:610  - INFO - ... 80.000% Complete (4/5). 10 Peptides Produced.

13:12:49 - glycresoft:peptide_permu:610  - INFO - ... 100.000% Complete (5/5). 14 Peptides Produced.

13:12:52 - glycresoft:peptide_permu:622  - INFO - Threaded Queue Closed

13:12:55 - glycresoft:peptide_permu:622  - INFO - Threaded Queue Closed

13:12:55 - glycresoft:remove_duplic:42   - INFO - ... Extracting Best Peptides

13:12:55 - glycresoft:remove_duplic:46   - INFO - ... Building Mask

13:12:55 - glycresoft:remove_duplic:51   - INFO - ... Removing Duplicates

13:12:55 - glycresoft:remove_duplic:56   - INFO - ... Complete


Next, when I try to search the preprocessed data against the generated glycopeptide database using more than one process, I run into an "Invalid argument" error immediately after the first batch is processed. This has not been 100% consistent, but the error occurs much more often than not. I have never hit this error when I specify "-p 1".
Input:

$ glycresoft analyze search-glycopeptide -p 4 -o mucmixresults.db fasta-mucmix.db CA125-40282-P1_3.preprocessed.mzML 1

Results:

... 

13:17:49 - glycresoft:profiler     :731  - INFO - Loading MS/MS

13:17:50 - glycresoft:glycopeptide_:171  - INFO - Writing Matches To TempFileManager('/var/folders/34/rfrfd6s92r14_k8yb940k5_r0000gn/T/tmp9h0IgLtfosercylg')

13:17:50 - glycresoft:glycopeptide_:175  - INFO - ... Begin Batch

13:17:50 - glycresoft:glycopeptide_:175  - INFO - ... controllerType=0 controllerNumber=1 scan=22166: 8802.105341 (+5)

13:17:50 - glycresoft:glycopeptide_:175  - INFO - ... 500/16308 (3.066%)

13:17:52 - glycresoft:glycopeptide_:177  - INFO - ... 232 Unconfirmed Precursor Spectra

13:17:52 - glycresoft:glycopeptide_:178  - INFO - ... Spectra Extracted

13:17:52 - glycresoft:spectrum_eval:201  - INFO - ... Begin Collecting Hits

13:17:52 - glycresoft:spectrum_eval:203  - INFO - ... Mapping For Unmodified

13:17:52 - glycresoft:spectrum_eval:216  - INFO - ...... Mapping 25.21% of spectra (122/484) 6089.5485

13:17:52 - glycresoft:spectrum_eval:238  - INFO - ...... Mapping Segment Done. (31 spectrum-pairs)

13:17:53 - glycresoft:spectrum_eval:216  - INFO - ...... Mapping 50.83% of spectra (246/484) 5558.9013

13:17:53 - glycresoft:spectrum_eval:238  - INFO - ...... Mapping Segment Done. (0 spectrum-pairs)

13:17:53 - glycresoft:spectrum_eval:216  - INFO - ...... Mapping 75.21% of spectra (364/484) 5296.2316

13:17:53 - glycresoft:spectrum_eval:238  - INFO - ...... Mapping Segment Done. (100 spectrum-pairs)

13:17:54 - glycresoft:matcher      :178  - INFO - ... Batch 1 (14153/14153) 100.00%

Traceback (most recent call last):

  File "/anaconda2/envs/GlycReSoft/bin/glycresoft", line 11, in <module>

    load_entry_point('glycan-profiling==0.3.12', 'console_scripts', 'glycresoft')()

  File "/anaconda2/envs/GlycReSoft/lib/python2.7/site-packages/glycan_profiling-0.3.12-py2.7-macosx-10.6-x86_64.egg/glycan_profiling/cli/__main__.py", line 44, in main

    base.cli.main(standalone_mode=True)

  File "/anaconda2/envs/GlycReSoft/lib/python2.7/site-packages/click/core.py", line 717, in main

    rv = self.invoke(ctx)

  File "/anaconda2/envs/GlycReSoft/lib/python2.7/site-packages/click/core.py", line 1137, in invoke

    return _process_result(sub_ctx.command.invoke(sub_ctx))

  File "/anaconda2/envs/GlycReSoft/lib/python2.7/site-packages/click/core.py", line 1137, in invoke

    return _process_result(sub_ctx.command.invoke(sub_ctx))

  File "/anaconda2/envs/GlycReSoft/lib/python2.7/site-packages/click/core.py", line 956, in invoke

    return ctx.invoke(self.callback, **ctx.params)

  File "/anaconda2/envs/GlycReSoft/lib/python2.7/site-packages/click/core.py", line 555, in invoke

    return callback(*args, **kwargs)

  File "/anaconda2/envs/GlycReSoft/lib/python2.7/site-packages/click/decorators.py", line 17, in new_func

    return f(get_current_context(), *args, **kwargs)

  File "/anaconda2/envs/GlycReSoft/lib/python2.7/site-packages/glycan_profiling-0.3.12-py2.7-macosx-10.6-x86_64.egg/glycan_profiling/cli/analyze.py", line 225, in search_glycopeptide

    gps, unassigned, target_decoy_set = analyzer.start()

  File "/anaconda2/envs/GlycReSoft/lib/python2.7/site-packages/glycan_profiling-0.3.12-py2.7-macosx-10.6-x86_64.egg/glycan_profiling/task.py", line 302, in start

    out = self.run()

  File "/anaconda2/envs/GlycReSoft/lib/python2.7/site-packages/glycan_profiling-0.3.12-py2.7-macosx-10.6-x86_64.egg/glycan_profiling/profiler.py", line 737, in run

    target_decoy_set = self.do_search(searcher)

  File "/anaconda2/envs/GlycReSoft/lib/python2.7/site-packages/glycan_profiling-0.3.12-py2.7-macosx-10.6-x86_64.egg/glycan_profiling/profiler.py", line 679, in do_search

    batch_size=self.spectrum_batch_size)

  File "/anaconda2/envs/GlycReSoft/lib/python2.7/site-packages/glycan_profiling-0.3.12-py2.7-macosx-10.6-x86_64.egg/glycan_profiling/tandem/glycopeptide/glycopeptide_matcher.py", line 183, in search

    simplify=simplify, *args, **kwargs)

  File "/anaconda2/envs/GlycReSoft/lib/python2.7/site-packages/glycan_profiling-0.3.12-py2.7-macosx-10.6-x86_64.egg/glycan_profiling/tandem/glycopeptide/matcher.py", line 233, in score_all

    simplify=simplify, *args, **kwargs)

  File "/anaconda2/envs/GlycReSoft/lib/python2.7/site-packages/glycan_profiling-0.3.12-py2.7-macosx-10.6-x86_64.egg/glycan_profiling/tandem/glycopeptide/matcher.py", line 180, in score_bunch

    batch, *args, **kwargs)

  File "/anaconda2/envs/GlycReSoft/lib/python2.7/site-packages/glycan_profiling-0.3.12-py2.7-macosx-10.6-x86_64.egg/glycan_profiling/tandem/spectrum_evaluation.py", line 293, in _evaluate_hit_groups

    batch, **kwargs)

  File "/anaconda2/envs/GlycReSoft/lib/python2.7/site-packages/glycan_profiling-0.3.12-py2.7-macosx-10.6-x86_64.egg/glycan_profiling/tandem/spectrum_evaluation.py", line 284, in _evaluate_hit_groups_multiple_processes

    mass_shift_map=self.mass_shift_map, solution_handler_type=handler_tp)

  File "/anaconda2/envs/GlycReSoft/lib/python2.7/site-packages/glycan_profiling-0.3.12-py2.7-macosx-10.6-x86_64.egg/glycan_profiling/tandem/evaluation_dispatch/process.py", line 104, in __init__

    self.input_queue = self._make_input_queue()

  File "/anaconda2/envs/GlycReSoft/lib/python2.7/site-packages/glycan_profiling-0.3.12-py2.7-macosx-10.6-x86_64.egg/glycan_profiling/tandem/evaluation_dispatch/process.py", line 129, in _make_input_queue

    return JoinableQueue(int(1e5))

  File "/anaconda2/envs/GlycReSoft/lib/python2.7/multiprocessing/__init__.py", line 225, in JoinableQueue

    return JoinableQueue(maxsize)

  File "/anaconda2/envs/GlycReSoft/lib/python2.7/multiprocessing/queues.py", line 298, in __init__

    Queue.__init__(self, maxsize)

  File "/anaconda2/envs/GlycReSoft/lib/python2.7/multiprocessing/queues.py", line 69, in __init__

    self._sem = BoundedSemaphore(maxsize)

  File "/anaconda2/envs/GlycReSoft/lib/python2.7/multiprocessing/synchronize.py", line 130, in __init__

    SemLock.__init__(self, SEMAPHORE, value, value)

  File "/anaconda2/envs/GlycReSoft/lib/python2.7/multiprocessing/synchronize.py", line 75, in __init__

    sl = self._semlock = _multiprocessing.SemLock(kind, value, maxvalue)

OSError: [Errno 22] Invalid argument
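
One possible explanation, not confirmed anywhere in this thread: the traceback ends in JoinableQueue(int(1e5)), and multiprocessing backs a bounded queue with a semaphore whose initial value is the queue size. If the platform's maximum semaphore value is smaller than 100000 (macOS commonly reports 32767), creating the semaphore can fail with "Invalid argument". The sketch below is only an illustration of clamping the requested size to the platform limit. The helper names clamp_queue_size and make_input_queue are hypothetical and are not part of GlycReSoft.

```python
import multiprocessing
from multiprocessing.synchronize import SEM_VALUE_MAX


def clamp_queue_size(requested_size, limit=SEM_VALUE_MAX):
    """Return the requested queue size, capped at the semaphore limit.

    `limit` defaults to the value the local Python build reports; it is
    exposed as a parameter so other platforms' limits can be simulated.
    """
    return min(int(requested_size), int(limit))


def make_input_queue(requested_size=int(1e5)):
    # Hypothetical stand-in for a queue factory (an assumption, not the
    # library's actual code): never ask for more slots than the platform
    # semaphore can represent.
    return multiprocessing.JoinableQueue(clamp_queue_size(requested_size))


if __name__ == "__main__":
    print("SEM_VALUE_MAX on this platform: %d" % SEM_VALUE_MAX)
    print("Queue size that would be used for 1e5: %d" % clamp_queue_size(1e5))
```

Running the script prints the limit for the local Python build, which is a quick way to check whether 100000 exceeds it on a given machine. If it does, that would also be consistent with "-p 1" never failing, since the single-process path would not need to create this queue.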



And last, when I did search my sample against the glycopeptide database, I don't think it properly loaded the matched targets/decoys (see the "Loaded 0/240 Targets" and "Loaded 0/231 Decoys" lines below), resulting in 0 identified glycopeptides. What might be causing this?
Input:
$ glycresoft analyze search-glycopeptide -p 1 -o mucmixresults.db fasta-mucmix.db CA125-40282-P1_3.preprocessed.mzML 1
Output, starting at last batch:
 13:21:47 - glycresoft:glycopeptide_:175  - INFO - ... Begin Batch

13:21:47 - glycresoft:glycopeptide_:175  - INFO - ... controllerType=0 controllerNumber=1 scan=4491: 895.458021 (+2)

13:21:47 - glycresoft:glycopeptide_:175  - INFO - ... 16308/16308 (100.000%)

13:21:48 - glycresoft:glycopeptide_:177  - INFO - ... 81 Unconfirmed Precursor Spectra

13:21:48 - glycresoft:glycopeptide_:178  - INFO - ... Spectra Extracted

13:21:48 - glycresoft:spectrum_eval:201  - INFO - ... Begin Collecting Hits

13:21:48 - glycresoft:spectrum_eval:203  - INFO - ... Mapping For Unmodified

13:21:48 - glycresoft:spectrum_eval:216  - INFO - ...... Mapping 30.77% of spectra (4/13) 880.4486

13:21:48 - glycresoft:spectrum_eval:238  - INFO - ...... Mapping Segment Done. (0 spectrum-pairs)

13:21:48 - glycresoft:spectrum_eval:216  - INFO - ...... Mapping 53.85% of spectra (7/13) 845.4760

13:21:48 - glycresoft:spectrum_eval:238  - INFO - ...... Mapping Segment Done. (0 spectrum-pairs)

13:21:48 - glycresoft:spectrum_eval:216  - INFO - ...... Mapping 76.92% of spectra (10/13) 835.5013

13:21:48 - glycresoft:spectrum_eval:238  - INFO - ...... Mapping Segment Done. (0 spectrum-pairs)

13:21:48 - glycresoft:glycopeptide_:184  - INFO - ... Spectra Searched

13:21:48 - glycresoft:glycopeptide_:189  - INFO - ...... Total Matches So Far: 240 Targets, 231 Decoys


13:21:48 - glycresoft:glycopeptide_:196  - INFO - Search Done

13:21:48 - glycresoft:glycopeptide_:201  - INFO - Reloading Spectrum Matches

13:21:48 - glycresoft:glycopeptide_:215  - INFO - Loaded 0/240 Targets (0%)

13:21:48 - glycresoft:glycopeptide_:220  - INFO - Loaded 0/231 Decoys (0%)

13:21:48 - glycresoft:glycopeptide_:226  - INFO - Running Target Decoy Analysis with 240 targets and 231 decoys

13:21:48 - glycresoft:profiler     :752  - INFO - 0 spectrum matches accepted

13:21:48 - glycresoft:profiler     :755  - INFO - Building and Mapping Chromatograms

13:21:48 - glycresoft:extract      :75   - INFO - ... Begin Extracting Chromatograms

13:22:18 - glycresoft:extract      :77   - INFO - ...... Aggregating Chromatograms

13:22:48 - glycresoft:extract      :56   - INFO - ... 19639 Chromatograms Extracted.

13:23:01 - glycresoft:glycopeptide_:254  - INFO - Mapping MS/MS Identifications onto Chromatograms

13:23:01 - glycresoft:glycopeptide_:255  - INFO - 16768 Chromatograms

13:23:02 - glycresoft:glycopeptide_:261  - INFO - Assigning Solutions

13:23:02 - glycresoft:chromatogram_:274  - INFO - ... 0/240 Solutions Handled (0.00%)

13:23:02 - glycresoft:glycopeptide_:263  - INFO - Distributing Orphan Spectrum Matches

13:23:02 - glycresoft:chromatogram_:285  - INFO - ... ScanTimeBundle(controllerType=0 controllerNumber=1 scan=13362, 0.7307, 42.7803) 0/167 Orphans Handled (0.00%)

13:23:02 - glycresoft:chromatogram_:285  - INFO - ... ScanTimeBundle(controllerType=0 controllerNumber=1 scan=17452, 0.5580, 54.4882) 100/167 Orphans Handled (59.88%)

13:23:02 - glycresoft:glycopeptide_:265  - INFO - Selecting Most Representative Matches

13:23:02 - glycresoft:profiler     :695  - INFO - Aggregating Assigned Entities

13:23:02 - glycresoft:chromatogram_:159  - INFO - Aggregating Common Entities: 16768 chromatograms

13:23:02 - glycresoft:chromatogram_:183  - INFO - After merging: 16768 chromatograms

13:23:02 - glycresoft:profiler     :762  - INFO - Scoring chromatograms

13:23:02 - glycresoft:profiler     :719  - INFO - Assigning consensus glycopeptides to spectrum clusters

13:23:02 - glycresoft:profiler     :767  - INFO - Saving solutions (0 identified glycopeptides)

13:23:02 - glycresoft:profiler     :894  - INFO - Saving Results To "mucmixresults.db"

mobiu...@gmail.com

Nov 10, 2019, 9:29:43 PM
to GlycReSoft
This was addressed over email: I didn't notice that the message had come through the group, so I did not reply-all. Has the issue been completely resolved?

Thank you,
Joshua Klein