Predict Molecular Formulae from GNPS output (or batch prediction from collection of MS files)?


Michael

Jun 16, 2020, 12:48:12 PM
to GNPS Discussion Forum and Bug Reports
Hello,

Does anyone know of a tool to batch-predict molecular formulae from a large set of HRMS RAW, mzXML, or MGF files? Ideally it would allow atom number constraints and output the top 2 or 3 predictions by confidence. Even better if it took data from a molecular network as input to reduce data points (since multiple scans get combined into single nodes). SIRIUS can do this with a set of MGF files, but it's PAINFULLY slow and often gets stuck in my experience. I've been looking around for something else but am having trouble finding anything aside from non-open-source tools such as MetaboScape. Thanks!

Mingxun Wang

Jun 16, 2020, 12:50:23 PM
to GNPS Discussion Forum and Bug Reports
Hey Michael,

So we do this with SIRIUS in a high-throughput manner in GNPS. It's definitely a work in progress, but we try to take mzML files and do exactly what you're talking about. If you want to be a part of the effort to find bugs and make it better, let me know!

Ming

Aksenov Alexander

Jun 16, 2020, 12:52:53 PM
to GNPS Discussion Forum and Bug Reports
Michael -
SIRIUS would be the best tool for that. To make it faster, you could delete features with a parent mass exceeding ~700 Da from your MGF; this is where SIRIUS gets stuck.
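One way to drop those features before handing the file to SIRIUS is a small script along these lines. This is only a minimal sketch: the function names `filter_mgf_by_precursor` and `filter_mgf_file` are made up for illustration, and it assumes a plain MGF layout with `BEGIN IONS` / `END IONS` blocks and a `PEPMASS=` line holding the precursor m/z (optionally followed by an intensity). It compares the precursor m/z directly against the cutoff, which only approximates the parent mass for singly charged ions.

```python
def filter_mgf_by_precursor(lines, max_mz=700.0):
    """Yield the lines of every MGF entry whose PEPMASS m/z is <= max_mz.

    Assumption: entries are delimited by BEGIN IONS / END IONS and the
    precursor m/z is the first value on the PEPMASS= line.
    Entries without a PEPMASS line are kept unchanged.
    """
    block = []
    in_block = False
    keep = True
    for line in lines:
        stripped = line.strip()
        if stripped == "BEGIN IONS":
            # Start collecting a new entry; keep it unless PEPMASS says otherwise.
            block, in_block, keep = [line], True, True
        elif in_block:
            block.append(line)
            if stripped.upper().startswith("PEPMASS="):
                # PEPMASS may be "mz" or "mz intensity"; take the first token.
                mz = float(stripped.split("=", 1)[1].split()[0])
                keep = mz <= max_mz
            elif stripped == "END IONS":
                if keep:
                    yield from block
                in_block = False
        else:
            # Lines outside any entry (blank lines, file-level headers) pass through.
            yield line


def filter_mgf_file(src_path, dst_path, max_mz=700.0):
    """Write a filtered copy of src_path to dst_path (hypothetical helper)."""
    with open(src_path) as src, open(dst_path, "w") as dst:
        dst.writelines(filter_mgf_by_precursor(src, max_mz=max_mz))
```

Calling `filter_mgf_file("features.mgf", "features_le700.mgf")` (both file names are placeholders) would write a copy without the heavy features, leaving the original file untouched. The ~700 cutoff is the rule of thumb from this post, not a hard limit.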



stefano papazian

Jun 16, 2020, 1:07:37 PM
to Michael, GNPS Discussion Forum and Bug Reports
I have been using the latest version of the SIRIUS/CANOPUS workflow in batch mode via the command line to process mzML files directly. It works much faster than the GUI, and using its own peak-picking method produced good isotope scores (compared to using pre-processed MGF files). I have not tried the version integrated in GNPS, but I am looking forward to testing it!

