Error while using the pyMolNetEnhancer script

שחף רופא‎

Sep 14, 2023, 7:31:44 AM

to GNPS Discussion Forum and Bug Reports

Hello everyone :)

I'm using the pyMolNetEnhancer repository:

https://github.com/madeleineernst/pyMolNetEnhancer

to add chemical classification to my FBMN (feature-based molecular networking) results. I'm running the Python workflow in Jupyter Notebook.

I submitted the job, then downloaded and unzipped the files from GNPS without problems, but unfortunately I get an error partway through the process, when calling unique_smiles():

TypeError: sequence item 0: expected str instance, float found

It looks like the script has trouble joining float/int columns.

I'd be thankful if someone could contact me to help with this. I'm a beginner in this field, and these are my first days in a data analysis role in a computational metabolomics laboratory.
Thank you in advance :) I'm attaching the full traceback here:

out = unique_smiles(matches)

--------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)

e:\Soliman's lab\git project\pyMolNetEnhancer-master\Example_notebooks\ChemicalClasses_2_Network_FeatureBased.ipynb Cell 11 line 1
----> 1 out = unique_smiles(matches)

File c:\Users\shach\AppData\Local\Programs\Python\Python39\lib\site-packages\pyMolNetEnhancer\molnetenhancer.py:157, in unique_smiles(matches)
    155         matches[index] = matches[index].rename(columns = {'Scan':'cluster.index'})
    156     if '#Scan#' in matches[index].columns:
--> 157         matches[index] = matches[index].groupby('#Scan#', as_index=False).agg(lambda x: ','.join(set(x.dropna())))
    158         matches[index] = matches[index].rename(columns = {'#Scan#':'cluster.index'})
    160 comb = reduce(lambda left,right: pd.merge(left,right,on='cluster.index', how = "outer"), matches)

File c:\Users\shach\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\groupby\generic.py:1492, in DataFrameGroupBy.aggregate(self, func, engine, engine_kwargs, *args, **kwargs)
   1490 gba = GroupByApply(self, [func], args=(), kwargs={})
   1491 try:
-> 1492     result = gba.agg()
   1494 except ValueError as err:
   1495     if "No objects to concatenate" not in str(err):

File c:\Users\shach\AppData\Local\Programs\Python\Python39\lib\site-packages\pandas\core\apply.py:178, in Apply.agg(self)
    175     return self.agg_dict_like()
    176 elif is_list_like(func):
    177     # we require a list, but not a 'str'
--> 178     return self.agg_list_like()
    180 if callable(func):...
--> 157         matches[index] = matches[index].groupby('#Scan#', as_index=False).agg(lambda x: ','.join(set(x.dropna())))
    158         matches[index] = matches[index].rename(columns = {'#Scan#':'cluster.index'})
    160 comb = reduce(lambda left,right: pd.merge(left,right,on='cluster.index', how = "outer"), matches)

TypeError: sequence item 0: expected str instance, float found
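
For readers hitting the same traceback: the TypeError comes from `','.join(...)` being handed non-string values, which happens when a grouped column holds floats. Below is a minimal sketch of the failure and one possible workaround (casting to `str` before joining). The toy table, the `MZErrorPPM` and `Smiles` columns, and the helper `join_unique` are assumptions for illustration, not part of pyMolNetEnhancer. Whether the package itself should be patched this way is not confirmed in this thread.

```python
import pandas as pd

# Plain-Python reproduction of the error message from the traceback:
# str.join refuses anything that is not a string.
try:
    ",".join({1.5})
except TypeError as err:
    message = str(err)

# Toy stand-in for one library-match table (hypothetical columns and values).
matches = pd.DataFrame({
    "#Scan#": [1, 1, 2],
    "Smiles": ["CCO", "CCO", "c1ccccc1"],
    "MZErrorPPM": [1.5, 2.0, 0.3],  # numeric column that would break str.join
})

def join_unique(col):
    """Join the unique non-null values of a column, cast to str first."""
    return ",".join(sorted(set(col.dropna().astype(str))))

# Same groupby/agg pattern as in unique_smiles, but with the str cast.
fixed = matches.groupby("#Scan#", as_index=False).agg(join_unique)
print(fixed)
```

An alternative under the same assumptions would be to convert the whole table with `matches.astype(str)` before grouping, although that also turns missing values into the literal string `"nan"`, so they would no longer be dropped by `dropna()`.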

Andrea Gentile

Jan 9, 2024, 12:09:01 PM

to GNPS Discussion Forum and Bug Reports

Hi,

Did you manage to solve the problem? I am having the same issue.

Andrea

Justin van der Hooft

Jan 12, 2024, 3:05:50 PM

to GNPS Discussion Forum and Bug Reports

Good evening Andrea, all,

Thanks for your (continued) interest in the MolNetEnhancer workflow. It looks like many of you are especially interested in the chemical compound classification part of the workflow, and indeed, this part is currently not functional. The reason is that the external sites no longer return any classification terms for the candidate structures present in the molecular network, which results in the "no matches" error. Such external dependencies are one of the risks of integrative tools such as MolNetEnhancer. You could ask the ClassyFire team whether they are planning to create a new API that returns precomputed classification terms for SMILES.

In the short term, there will be no replacement for the existing workflow; however, I hope that in due time we will be able to provide one through alternative means.

In the meantime, you could set up your own database of precomputed annotated structures to retrieve the classifications from (and feed them into MolNetEnhancer), or rely on tools such as CANOPUS to do the chemical compound classification.
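
As a rough illustration of the "own database" idea, the sketch below looks up classification terms for candidate structures in a local table instead of calling an external service. Everything in it is an assumption for illustration: the column names, the placeholder class labels, and the exact-string match on SMILES. It does not reproduce the input format MolNetEnhancer expects.

```python
import pandas as pd

# Hypothetical local table of precomputed classifications, one row per
# structure. The class labels are placeholders, not real ClassyFire terms.
local_db = pd.DataFrame({
    "smiles": ["CCO", "c1ccccc1"],
    "superclass": ["placeholder_superclass_A", "placeholder_superclass_B"],
})

# Hypothetical candidate structures from a molecular network.
candidates = pd.DataFrame({
    "cluster.index": [1, 2, 3],
    "smiles": ["CCO", "c1ccccc1", "CCN"],
})

# A left join keeps every candidate; structures absent from the local
# table end up with NaN in the classification column.
annotated = candidates.merge(local_db, on="smiles", how="left")

# Structures that still need to be classified by other means.
missing = annotated.loc[annotated["superclass"].isna(), "smiles"].tolist()
print(annotated)
print(missing)
```

Matching on raw SMILES strings is fragile, since one molecule can be written as several different SMILES. A real setup would likely need to canonicalize structures, or key the table on an identifier such as InChIKey, before the lookup.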

If the chemical compound classification part of the MolNetEnhancer workflow gets updated, we will let the community know.

Best,

Justin

PS: If you find a solution or set up your own classification term API, please do inform the community!