Composition based features

debasis.se...@gmail.com

Apr 5, 2019, 9:54:32 AM
to matminer
Hi
I have gone through the tutorial in the following link


It worked fine. Now I am trying to generate other composition-based features, but I have difficulty working out what needs to be passed as an argument to featurize_dataframe other than "df".

For example

>>> from matminer.featurizers.composition import AtomicOrbitals
>>> ao_feat = AtomicOrbitals()
>>> ao_feat.featurize_dataframe(df,"composition")

The above did not work (I used the data from the above link and followed the steps up to generating oxidation states).

I have tried a couple of other composition-based featurizers, but haven't been able to run them successfully. The documentation is perhaps very good for users with advanced knowledge of Python classes, but it is not easy for me to follow. For example, the following did not work either:

>>> from matminer.featurizers.composition import ElectronAffinity
>>> ea_feat =  ElectronAffinity()
>>> ea_feat.featurize_dataframe(df,"composition")

An example for each of these featurizers would be really helpful for users without advanced knowledge of Python.

Any help on how to generate these composition-based features would be really appreciated.

Thanks
Debasis


Logan Ward

Apr 5, 2019, 10:37:01 AM
to debasis.se...@gmail.com, matminer

Hello Debasis,

 

For each of these featurizers, the “featurize_dataframe” operation takes the dataframe “df” and the name of the column in that dataframe that holds the inputs to featurize (Pandas dataframes represent tables with rows and columns).
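
As a rough illustration of that calling convention, here is a toy stand-in written in plain Python. It is not matminer or pandas code: the table is a dict of columns, `featurize_column` is a hypothetical helper, and `count_characters` is a made-up per-row function. The only point is that the second argument names the column whose values are fed to the featurizer, and that a wrong name fails with a KeyError.

```python
def featurize_column(table, col_id, featurize, labels):
    """Apply `featurize` to every entry of column `col_id` and add the results as new columns.

    `table` maps column names to equal-length lists (a toy stand-in for a dataframe).
    """
    if col_id not in table:
        raise KeyError(f"{col_id!r} is not one of the columns: {list(table)}")
    rows = [featurize(entry) for entry in table[col_id]]
    result = dict(table)  # shallow copy so the input table is left untouched
    for i, label in enumerate(labels):
        result[label] = [row[i] for row in rows]
    return result


def count_characters(formula):
    """Made-up featurizer: returns [length, number of digits] of a formula string."""
    return [len(formula), sum(ch.isdigit() for ch in formula)]


table = {"formula": ["Fe2O3", "NaCl"]}
featurized = featurize_column(table, "formula", count_characters, ["n_chars", "n_digits"])
print(featurized)
```

Passing a column name that is not in the table (for example "composition" here) raises the KeyError instead of producing features.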

 

The error that I think might be happening here is that you did generate the composition with oxidation states, but are not calling “featurize_dataframe” with the column of the dataframe that contains the oxidized compositions (I think that defaults to “oxid_composition”).

 

One thing that would help us debug your problem (now and in general) would be to paste the error that you are receiving. Generally, the last few lines of the error message contain the information that is most relevant to us. Python reports the type of the error after telling you exactly where the program was (in terms of which function was called by which other function). So, if you tell us the type of error and the last line in that “stack trace,” we can help you a lot more easily.
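
To make the stack trace advice concrete, here is a minimal sketch that uses only Python's standard `traceback` module to pull out the two pieces of information mentioned above: the error type and the innermost call. The names `summarize_exception` and `lookup` are hypothetical and exist only for this illustration.

```python
import traceback


def summarize_exception(exc):
    """Return the error type, message, and innermost frame of an exception."""
    frames = traceback.extract_tb(exc.__traceback__)
    last = frames[-1]  # innermost call, which is printed last in the stack trace
    return {
        "type": type(exc).__name__,
        "message": str(exc),
        "function": last.name,
        "lineno": last.lineno,
    }


def lookup(values):
    return values[0]  # fails when `values` is None


try:
    lookup(None)
except TypeError as exc:
    summary = summarize_exception(exc)
    print(summary["type"], "in", summary["function"], "->", summary["message"])
```

The printed line carries the same information as the final lines of a pasted traceback, which is usually enough for someone else to start diagnosing the problem.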

 

Also, thanks for letting us know the docs were unclear for those without much Python experience. We’ll figure out how to revise them.

 

Best,

Logan

--
You received this message because you are subscribed to the Google Groups "matminer" group.
To unsubscribe from this group and stop receiving emails from it, send an email to matminer+u...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

 

debasis.se...@gmail.com

Apr 5, 2019, 11:02:37 AM
to matminer
Hello Logan,

It seems that the column name is "composition_oxid"; that is what df.columns prints.
In fact, I used the "composition_oxid" column for generating the atomic orbital and electron affinity features. The last few lines of the error output follow. In addition, it gives a warning at the beginning:
/global/homes/d/dxs/.local/cori/3.6-anaconda-4.4/lib/python3.6/site-packages/matminer/featurizers/composition.py:390: UserWarning: AtomicOrbitals: Al3+1 Co2+1 Co3+1 Si4-2 truncated to Al(CoSi)2

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/global/homes/d/dxs/.local/cori/3.6-anaconda-4.4/lib/python3.6/site-packages/matminer/featurizers/base.py", line 340, in featurize_dataframe
    pbar=pbar)
  File "/global/homes/d/dxs/.local/cori/3.6-anaconda-4.4/lib/python3.6/site-packages/matminer/featurizers/base.py", line 467, in featurize_many
    return p.map(func, entries, chunksize=self.chunksize)
  File "/usr/common/software/python/3.6-anaconda-4.4/lib/python3.6/multiprocessing/pool.py", line 266, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "/usr/common/software/python/3.6-anaconda-4.4/lib/python3.6/multiprocessing/pool.py", line 644, in get
    raise self._value
TypeError: 'NoneType' object is not subscriptable
To skip errors when featurizing specific compounds, consider running the batch featurize() operation (e.g., featurize_many(), featurize_dataframe(), etc.) with ignore_errors=True


debasis.se...@gmail.com

Apr 10, 2019, 9:59:24 AM
to matminer
Hello Logan,
Would you be able to see what mistake I am making in computing the atomic orbital features? I added the error in my last message.
Thanks
Debasis

Logan Ward

Apr 10, 2019, 10:21:50 AM
to debasis.se...@gmail.com, matminer

I’m not sure what’s happening for this one.

 

Could you add “ao_feat.set_n_jobs(1)” after “ao_feat = AtomicOrbitals()” and re-run to get a clearer error message (multiprocessing tends to eat the most useful error messages)?

 

Logan


debasis.se...@gmail.com

Apr 11, 2019, 5:08:32 PM
to matminer
Hi Logan,
I did as you recommended. Here is the script, followed by the errors. Note that the composition_oxid column was generated beforehand, following the tutorial.
>>> from matminer.featurizers.composition import AtomicOrbitals
>>> ao_feat = AtomicOrbitals()
>>> ao_feat.set_n_jobs(1)
>>> ao_feat.featurize_dataframe(df,"composition_oxid")

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/dxs/.local/lib/python3.6/site-packages/matminer/featurizers/base.py", line 337, in featurize_dataframe
    features = self.featurize_many(df[col_id].values,
  File "/home/dxs/.local/lib/python3.6/site-packages/pandas/core/frame.py", line 2934, in __getitem__
    raise_missing=True)
  File "/home/dxs/.local/lib/python3.6/site-packages/pandas/core/indexing.py", line 1354, in _convert_to_indexer
    return self._get_listlike_indexer(obj, axis, **kwargs)[1]
  File "/home/dxs/.local/lib/python3.6/site-packages/pandas/core/indexing.py", line 1161, in _get_listlike_indexer
    raise_missing=raise_missing)
  File "/home/dxs/.local/lib/python3.6/site-packages/pandas/core/indexing.py", line 1246, in _validate_read_indexer
    key=key, axis=self.obj._get_axis_name(axis)))
KeyError: "None of [Index(['composition_oxid'], dtype='object')] are in the [columns]"
>>>

debasis.se...@gmail.com

Apr 11, 2019, 5:22:24 PM
to matminer
Logan,
Please ignore my previous message (I forgot to generate "composition_oxid" before running AtomicOrbitals). The new error follows.
>>> from matminer.featurizers.composition import AtomicOrbitals
>>> ao_feat = AtomicOrbitals()
>>> ao_feat.set_n_jobs(1)
>>> ao_feat.featurize_dataframe(df,"composition_oxid")
AtomicOrbitals:   0%|          | 0/1181 [00:00<?, ?it/s]
/home/dxs/.local/lib/python3.6/site-packages/matminer/featurizers/composition.py:390: UserWarning:

AtomicOrbitals: Al3+1 Co2+1 Co3+1 Si4-2 truncated to Al(CoSi)2

/home/dxs/.local/lib/python3.6/site-packages/matminer/featurizers/composition.py:390: UserWarning:

AtomicOrbitals: Sb3+1 Pt2-5 Pt2+1 Pt5+1 truncated to SbPt7

/home/dxs/.local/lib/python3.6/site-packages/matminer/featurizers/composition.py:390: UserWarning:

AtomicOrbitals: Sc2+1 Sc3+1 Al3+1 Si4-2 truncated to Sc2AlSi2


Traceback (most recent call last):
  File "/home/dxs/.local/lib/python3.6/site-packages/matminer/featurizers/base.py", line 493, in featurize_wrapper
    return self.featurize(*x)
  File "/home/dxs/.local/lib/python3.6/site-packages/matminer/featurizers/composition.py", line 396, in featurize
    feat['{}_character'.format(edge)] = homo_lumo[edge][1][-1]

TypeError: 'NoneType' object is not subscriptable

During handling of the above exception, another exception occurred:


Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/dxs/.local/lib/python3.6/site-packages/matminer/featurizers/base.py", line 340, in featurize_dataframe
    pbar=pbar)
  File "/home/dxs/.local/lib/python3.6/site-packages/matminer/featurizers/base.py", line 450, in featurize_many
    return_errors=return_errors) for x in entries]
  File "/home/dxs/.local/lib/python3.6/site-packages/matminer/featurizers/base.py", line 450, in <listcomp>
    return_errors=return_errors) for x in entries]
  File "/home/dxs/.local/lib/python3.6/site-packages/matminer/featurizers/base.py", line 508, in featurize_wrapper
    reraise(type(e), type(e)(msg), sys.exc_info()[2])
  File "/usr/local/lib/python3.6/site-packages/six.py", line 692, in reraise
    raise value.with_traceback(tb)
  File "/home/dxs/.local/lib/python3.6/site-packages/matminer/featurizers/base.py", line 493, in featurize_wrapper
    return self.featurize(*x)
  File "/home/dxs/.local/lib/python3.6/site-packages/matminer/featurizers/composition.py", line 396, in featurize
    feat['{}_character'.format(edge)] = homo_lumo[edge][1][-1]

TypeError: 'NoneType' object is not subscriptable
To skip errors when featurizing specific compounds, consider running the batch featurize() operation (e.g., featurize_many(), featurize_dataframe(), etc.) with ignore_errors=True
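
The failing expression in the traceback, `homo_lumo[edge][1][-1]`, raises this TypeError whenever one of the objects being indexed is None. The sketch below reproduces that kind of failure with made-up data and shows the general skip-on-error pattern that the hint describes. It is not matminer's implementation: `featurize_many_sketch`, `edge_character`, and the sample entries (including the numeric value) are hypothetical.

```python
import math


def edge_character(homo_lumo, edge="HOMO"):
    """Mimic the failing line: take the last character of the orbital label."""
    return homo_lumo[edge][1][-1]


def featurize_many_sketch(entries, featurize, ignore_errors=False):
    """Featurize each entry; with ignore_errors=True, failed entries become NaN."""
    results = []
    for entry in entries:
        try:
            results.append(featurize(entry))
        except Exception:
            if not ignore_errors:
                raise  # default behaviour: stop at the first failing entry
            results.append(math.nan)
    return results


entries = [
    {"HOMO": ("Si", "3p", -0.17)},  # well-formed, made-up entry
    {"HOMO": None},                 # missing data triggers the TypeError
]
features = featurize_many_sketch(entries, edge_character, ignore_errors=True)
print(features)
```

Skipping errors this way lets the remaining rows finish, but the rows that come back as NaN still point at the entries whose underlying data needs to be inspected.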
