Re: Question on NWChem conversion steps

Peter Murray-Rust

Nov 10, 2011, 5:23:34 PM
to Jorge Estrada, Quixote mail list, DeJong, Wibe A
I am recopying this to the Quixote mail list. We (and I am as bad as anyone!) should be using the lists for everything except personal stuff. I sent a number of "URGENT" messages recently which should really have been on the lists.

On Thu, Nov 10, 2011 at 1:47 PM, Jorge Estrada wrote:
Sam, Peter,

We are studying the uploading of NWChem files to chempound. We want to know exactly how to test the conversion standalone first.

My question is: when converting an NWChem output file into CML, I see you do it in 2 steps: first, call NWChemLog2XMLConverter, then call NWChemLogXML2XMLConverter. What tasks does each converter perform?
Effectively, the first converter extracts the information on a chunk-by-chunk, line-by-line, or even field-by-field basis. The result is CML, but it normally does not conform to any convention. For example, a solid state researcher might want a CMLComp document, while a chemist might want the compchem one. The first parse is therefore "raw". As an example, we want to put the host name into the compchem:environment module. The name might occur at the start of the output in some codes and at the end in others. The second parse gathers the information and transforms/filters/sorts it to be consistent with a convention.
I have discussed this at length with Bert (copied; BTW Bert, I assume you are on the lists). He believes that when we FoXify NWChem it will be possible to emit convention-compliant output directly. If so, it will need to be able to manage multiple conventions, and I suspect it will still have to be generic, though close.
So in essence:
 * FooLog2FooXMLConverter produces FooCML which roughly maps onto the output.
 * FooXML2CMLConverter transforms FooCML to convention-compliant CML. (Which suggests to me that there will be different transformers for every convention.)
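
To make the split concrete, here is a minimal sketch of the two stages in Python. It is not the converter code itself: the "key = value" log format, the foo: dictRefs, the function names and the field-to-module table are all made up for illustration, and XML namespaces are left out for brevity. Only the idea (a raw parse that mirrors the log, then a second pass that gathers, filters and sorts into convention modules such as compchem:environment) comes from the description above.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical table: which raw field belongs in which convention module.
# A different convention would supply a different table (or transformer).
FIELD_TO_MODULE = {
    "foo:hostname": "environment",
    "foo:energy": "finalization",
}
# Fixed module order, so the result does not depend on where a field
# happened to appear in the log.
MODULE_ORDER = ["environment", "finalization"]


def log_to_raw_xml(log_text):
    """Stage 1: line-by-line extraction. Output order mirrors the log."""
    root = ET.Element("module", {"title": "raw"})
    for line in log_text.splitlines():
        match = re.match(r"\s*(\w+)\s*=\s*(\S+)\s*$", line)
        if match:  # lines that are not "key = value" are ignored
            scalar = ET.SubElement(
                root, "scalar", {"dictRef": "foo:" + match.group(1)}
            )
            scalar.text = match.group(2)
    return root


def raw_xml_to_convention(raw_root):
    """Stage 2: gather raw fields, drop unmapped ones, sort into modules."""
    job = ET.Element("module", {"dictRef": "compchem:job"})
    modules = {}
    for name in MODULE_ORDER:
        modules[name] = ET.SubElement(
            job, "module", {"dictRef": "compchem:" + name}
        )
    for scalar in raw_root.iter("scalar"):
        target = FIELD_TO_MODULE.get(scalar.get("dictRef"))
        if target is None:
            continue  # filtered out: not part of this convention
        copy = ET.SubElement(modules[target], "scalar", dict(scalar.attrib))
        copy.text = scalar.text
    return job


if __name__ == "__main__":
    # Host name appears at the END of this made-up log; stage 2 still
    # places it in the environment module, ahead of the energy.
    sample_log = "energy = -76.02\nsome banner text\nhostname = node01\n"
    raw = log_to_raw_xml(sample_log)
    print(ET.tostring(raw_xml_to_convention(raw), encoding="unicode"))
```

Supporting a second convention in this sketch would mean swapping the table and module order, which is the "different transformers for every convention" point.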

Since later we would like to extract additional information from the uploaded CMLs to be shown by Chempound, which class deals with the selection of the fields to show in the Chempound page for each CML file?
That's an excellent question. I hope Sam can answer it in a few seconds. Otherwise it should be possible to explore the code.
What should we do if we want to add a new simple CML element (say one which has a single double value) to the Chempound page for an NWChem CML file?
My guess is that Sam uses a Freemarker Template for this (this is what he has suggested for the input). In which case we need to agree on a data model for Freemarker.
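
As a starting point for that discussion, here is a rough Python sketch of what such a data model could look like for one double-valued property. Everything specific in it is an assumption: the nwchem:totalEnergy dictRef, the units value, the example.org namespace URIs, the function names and the shape of the resulting dictionary are invented for illustration, and this is not how Chempound currently does it. The only fixed points are the CML namespace and the usual property/scalar nesting.

```python
import xml.etree.ElementTree as ET

CML_URI = "http://www.xml-cml.org/schema"
CML_NS = {"cml": CML_URI}

# Made-up CML fragment; dictRef, units and the example.org URIs are
# placeholders, not real NWChem dictionary entries.
SAMPLE_CML = """\
<module xmlns="http://www.xml-cml.org/schema"
        xmlns:nwchem="http://example.org/nwchem-dict"
        xmlns:nonsi="http://example.org/units">
  <property dictRef="nwchem:totalEnergy">
    <scalar dataType="xsd:double" units="nonsi:hartree">-76.026</scalar>
  </property>
</module>
"""


def build_data_model(cml_text, wanted):
    """Map display labels to {value, units} for the requested dictRefs.

    `wanted` maps a label shown on the page to the dictRef of the CML
    property holding the value. Properties that are absent are skipped.
    """
    root = ET.fromstring(cml_text)
    model = {}
    for label, dict_ref in wanted.items():
        for prop in root.iter("{%s}property" % CML_URI):
            if prop.get("dictRef") != dict_ref:
                continue
            scalar = prop.find("cml:scalar", CML_NS)
            if scalar is not None and scalar.text:
                model[label] = {
                    "value": float(scalar.text),
                    "units": scalar.get("units"),
                }
            break  # use the first matching property only
    return model


def render_rows(model):
    """Stand-in for the template step: one text row per model entry."""
    return [
        "{}: {} {}".format(label, entry["value"], entry["units"])
        for label, entry in model.items()
    ]


if __name__ == "__main__":
    model = build_data_model(SAMPLE_CML, {"Total energy": "nwchem:totalEnergy"})
    for row in render_rows(model):
        print(row)
```

With a model of this kind, adding a new value to the page would mean adding one entry to the label-to-dictRef mapping and one line to the template, without touching the converters.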


Delighted to see the discussion.


BTW many thanks for the help from Sam and Jorge for our demo. It went very well.

Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge