Compound search in GNPS


Yaroslav Lyutvinskiy

Jul 28, 2017, 1:24:51 PM
to GNPS Discussion Forum and Bug Reports
Sorry, maybe I missed something, but is it possible to search GNPS for a particular compound by SMILES, InChI, or at least by name? I tried to find out whether the libraries contain any spectra of penicillin G and could not find a way to do it.

With respect,
Yaroslav

Pieter Dorrestein

Jul 28, 2017, 5:50:49 PM
to GNPS Discussion Forum and Bug Reports
By name, yes, but not by InChI or SMILES; for that you have to download the library at this time. To search by name, click on the library you want to search at http://gnps.ucsd.edu/ProteoSAFe/libraries.jsp, then enter the name and hit Filter on the left-hand side. You will also have to search for each of the 226 synonyms listed in PubChem; see https://pubchem.ncbi.nlm.nih.gov/compound/penicillin_g#section=Depositor-Supplied-Synonyms

However, because so many different names are in use, it is better to search by mass (including adducts such as H+, K+ and Na+ in positive mode) and then filter. If you want to know whether it has been seen before in any public data set, use Molecular Explorer.
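
A minimal sketch of the mass-based approach, for illustration only: it computes the m/z values one might type into the library filter for penicillin G (C16H18N2O4S) as the H+, Na+ and K+ adducts. The function and variable names are made up for this example and are not part of GNPS; the adduct shifts assume singly charged positive ions.

```python
# Monoisotopic masses (Da) of the elements needed for this example.
MONOISOTOPIC = {
    "C": 12.0,
    "H": 1.00782503,
    "N": 14.0030740,
    "O": 15.9949146,
    "S": 31.9720707,
}

# Mass added to the neutral molecule for each singly charged positive adduct:
# the proton mass, or the metal atom mass minus one electron.
ADDUCT_SHIFT = {
    "[M+H]+": 1.007276,
    "[M+Na]+": 22.989218,
    "[M+K]+": 38.963158,
}


def neutral_mass(composition):
    """Monoisotopic mass of a neutral molecule given as {element: count}."""
    return sum(MONOISOTOPIC[element] * count for element, count in composition.items())


def adduct_mz(composition):
    """Expected m/z for each adduct in ADDUCT_SHIFT (charge +1)."""
    mass = neutral_mass(composition)
    return {adduct: mass + shift for adduct, shift in ADDUCT_SHIFT.items()}


# Elemental composition of penicillin G, C16H18N2O4S.
PENICILLIN_G = {"C": 16, "H": 18, "N": 2, "O": 4, "S": 1}

for adduct, mz in adduct_mz(PENICILLIN_G).items():
    print(f"{adduct}: {mz:.4f}")
```

Each printed value can then be used as the parent mass to filter on, one adduct at a time.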

P

Yaroslav Lyutvinskiy

Jul 29, 2017, 6:37:29 PM
to GNPS Discussion Forum and Bug Reports
Thank you, Pieter.

I see the point; that is an option. However, I found it a bit inconvenient to search through all the libraries this way. This is a very common problem with chemical naming and, unfortunately, I see that not all of the annotated spectra come with SMILES or InChI. I understand that there is no simple solution to this. Let's hope that in the future the use of InChI or SMILES will become more common.

Yaroslav

Pieter Dorrestein

Jul 31, 2017, 12:53:59 PM
to GNPS Discussion Forum and Bug Reports
Agreed, it is not the most convenient, and improving such basic search capabilities is on our to-do list, as I agree it is important. Our first goal with GNPS was to enable searching large data sets of spectra against all publicly available libraries, rather than manual filtering or searching one spectrum at a time. MoNA and METLIN have such a capability and you might want to look there as well. In the end, all data and information should be easily searchable with any kind of input, akin to a Google-style search. We are aware that this is not yet done well within GNPS. At this time you can download the library, and if the molecule has an InChI or SMILES associated with it you can search for it within the downloaded file.
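
For illustration, a rough sketch of searching a downloaded library by structure or parent mass. It assumes the download is in MGF format with `SMILES`, `INCHI` and `PEPMASS` header fields between `BEGIN IONS` and `END IONS`; check these field names against the actual file. The function names are invented for this example. Matching is by exact string, so a different but equivalent SMILES for the same molecule will not match unless both are canonicalized first.

```python
def read_mgf_headers(lines):
    """Yield one dict of KEY=VALUE header fields per spectrum.

    `lines` is any iterable of strings, for example an open file handle.
    Peak lines (no '=' sign) are ignored.
    """
    entry = None
    for raw in lines:
        line = raw.strip()
        if line == "BEGIN IONS":
            entry = {}
        elif line == "END IONS":
            if entry is not None:
                yield entry
            entry = None
        elif entry is not None and "=" in line:
            # Split on the first '=' only, so values such as
            # "InChI=1S/..." keep their own '=' signs.
            key, _, value = line.partition("=")
            entry[key.strip().upper()] = value.strip()


def find_by_structure(lines, query, fields=("SMILES", "INCHI")):
    """Return entries whose SMILES or InChI field equals `query` exactly."""
    query = query.strip()
    if not query:
        return []
    return [
        entry
        for entry in read_mgf_headers(lines)
        if any(entry.get(field, "") == query for field in fields)
    ]


def find_by_precursor(lines, target_mz, tol_da):
    """Return entries whose PEPMASS lies within `tol_da` of `target_mz`."""
    hits = []
    for entry in read_mgf_headers(lines):
        tokens = entry.get("PEPMASS", "").split()
        try:
            mz = float(tokens[0])  # PEPMASS may be followed by an intensity
        except (IndexError, ValueError):
            continue  # skip entries without a usable precursor m/z
        if abs(mz - target_mz) <= tol_da:
            hits.append(entry)
    return hits
```

To use it, open the downloaded file and pass the handle as `lines`. Entries without an InChI or SMILES can only be found through `find_by_precursor`, which is why parent mass is the safer first filter.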

There are two things we would like to enable within the GNPS platform (among the many other features we would like, such as GC-MS living data and crowdsourced analysis): 1) searching all libraries by parent mass in one search, and 2) searching by all other data types, including InChI and SMILES. There will be a limitation with the second: many spectra that the community has annotated, such as dimers, multimers, partially known structures, chimeric MS/MS, and MS/MS spectra of clusters such as sodium formate clusters, cannot be defined using InChI or SMILES, and not all spectra of named molecules have an InChI or SMILES. A structure search therefore covers only a limited number of spectra, so parent mass should be the key way to search first.

Yaroslav Lyutvinskiy

Aug 7, 2017, 7:21:37 AM
to GNPS Discussion Forum and Bug Reports
Convenient automated searching of big data sets is a very big deal, and its absence is a great deficiency of metabolomics data processing; I totally agree! I see an interesting possibility in searching by parent ion mass. Since you have an identification for the MS/MS spectrum, you can calculate the true mass of the ion from atomic masses, even if the parent mass was measured with notable error (for example on a QQQ). Then, if you allow searching by that calculated mass rather than by the measured one, it becomes easier to catch the right ID when a more accurate mass is at hand (for example from an Orbitrap).
One very basic feature I would like to see in GNPS is a separate Ion Mass Tolerance parameter for library search. We have our own data from a Q Exactive, and a tight mass tolerance (0.01 Da or, better, 10 ppm) would be appropriate for processing this data with clustering and networking. However, there are a lot of perfectly good spectra in the libraries that were obtained on low mass accuracy instruments such as QQQ or ion traps. To search these libraries we have to set a relaxed mass tolerance such as 0.5 Da. I end up doing a separate run for the searches, with no clustering, and afterwards merge in the clustering information with my own local script.
I clearly understand your desire to make the interface as simple as possible, with a minimal set of parameters; however, to me this looks like an oversimplification. Maybe it makes sense to introduce such a parameter in, say, the Advanced Library Search Options section?
I think I will also post this message to the Feature Request Forum.
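
To make the tolerance argument concrete, here is a small sketch comparing a ppm window with a fixed Da window, and matching against a calculated rather than a measured library mass. The m/z values are hypothetical and the helper names are invented for this example; this is not how GNPS implements its search.

```python
def ppm_to_da(mz, ppm):
    """Width in Da of a `ppm` tolerance at the given m/z."""
    return mz * ppm / 1e6


def within_tolerance(query_mz, reference_mz, tol, unit="ppm"):
    """True if query_mz matches reference_mz within tol (unit 'ppm' or 'Da')."""
    if unit == "ppm":
        window = ppm_to_da(reference_mz, tol)
    elif unit == "Da":
        window = tol
    else:
        raise ValueError(f"unknown tolerance unit: {unit!r}")
    return abs(query_mz - reference_mz) <= window


# Hypothetical numbers for one annotated library spectrum:
query_mz = 335.1058        # precursor measured on a high-resolution instrument
library_measured = 335.3   # precursor as recorded on a low-resolution instrument
library_calculated = 335.1060  # [M+H]+ computed from the annotated formula

# Against the measured library mass, only the relaxed window matches.
print(within_tolerance(query_mz, library_measured, 10, "ppm"))   # False
print(within_tolerance(query_mz, library_measured, 0.5, "Da"))   # True

# Against the calculated mass, the tight 10 ppm window is enough.
print(within_tolerance(query_mz, library_calculated, 10, "ppm"))  # True
```

At m/z 335 a 10 ppm window is only about 0.003 Da wide, which is why a single shared tolerance cannot serve both tight clustering and matching against low-resolution library spectra.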

Najla Moufarreg

Sep 29, 2017, 3:18:45 PM
to GNPS Discussion Forum and Bug Reports
Hi everyone,

I would like to second this opinion: I would also like to see this option separated, with one m/z tolerance for the database search and another for clustering. GNPS is already amazing; congrats and thanks, Pieter.