Force fields and basis sets [was: FW: Problems creating valid instance of the CML3 schema]


Peter Murray-Rust

Feb 11, 2011, 12:32:36 PM
to Sebastian Breuers, quixote-...@googlegroups.com, Joe Townsend, ste...@zib.de, mw...@cam.ac.uk, Jens Thomas, Pablo Echenique, Weerapong Phadungsukanan
[Sebastian and some of us have been having a discussion about what CML dictionaries can support so we have brought it here. I have left some of the previous discussion but please cut it off in your reply]


On Fri, Feb 11, 2011 at 12:51 PM, Sebastian Breuers <breu...@uni-koeln.de> wrote:
Wow, there are a lot more people in the discussion included. :) I think they belong to the Quixote project?

And we should probably take the discussion there :-)

We are converting the PARM files (basically 1,2,3, 4 atom parameters) to CML. We are not converting the functional form (that will be a lookup)
Ok, I think I was confused that I found in the compchemDict.cml this list of basis set names, while you are referring to the parameters that describe a force field.

The dictionaries are being refactored this weekend. Joe and Sam have given us templates and I will probably go through the existing dictionaries and make them conform and also refactor the content.
 
The confusion arose from the fact that I think of basis sets as an important part of the model chemistry of QM and of force fields as the comparable element in MD.

So do I - but I haven't always
 
When I saw how you build the force field with all its parameters into the CML while the basis sets are just covered with names (in the version of CML I know up to now) it felt like a mismatch.

 The intention is that the basis sets will also expose all their details - in this case as EMSL - but there are standard values provided by PNL.

For force fields there are no authorities providing values so it's important to show the explicit values.

Did you also do it for basis sets?
 
No, we intend to use EMSL.
But as I can see here you just used this dictionary form for basis sets as a start or alternative and will in further steps include the parameters from the EMSL into the CML.

The first pass was really to scan the literature for basis sets and include all those we found.
 
I think for a start I would introduce the keywords for the force fields into our MD dictionaries, as in my opinion this fits quite well into the current development of the compchemDict.cml. Though I think it would be best to develop different dictionaries for the 3 domains.

3? domains? I have QM and FF/MD so I would have 2 domains. There would also be a general dictionary which was independent of method




The mdout is very early - 2 hours old - and it's not yet committed. Possibly tonight. It will be in https://bitbucket.org/wwmm/jumbo-converters/src/78a115ee8f64/jumbo-converters-compchem/src/test/resources/compchem/amber/mdout


Which QM dictionaries?
I was referring to the compchem.cml dictionary that I could download from the xml-cml.org page after instruction by Joe. There is for example a list of basis sets names that can be used to describe the basis set instead of including the description of the different factors in the basis set.

I think Quixote will refactor this and perhaps provide normalization for EMSL lookup
I hope and I think that for the next three months the option of using force fields with a keyword from a dictionary can coexist with your solution of using a full description of the force field.

IFF the keyword is accepted by the community then that is probably OK. Unfortunately different versions of the same program and certainly mutants of the program change the force field values but not the name. If FFs had version numbers I would be relaxed. But saying "MM2" or "AMBER94" is a fragile way of defining a set of values.

AMBER reads the values from the file so there has to be some way of translating "AMBER-PARM94" to a set of values. How do we do this so everyone has the same data?
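
One hedged way to make a name such as "AMBER-PARM94" less fragile is to publish a content fingerprint next to it, so that two people can check they hold the same values. The sketch below is only an illustration: the function name, the normalisation rule (strip trailing whitespace, ignore blank lines) and the sample numbers are assumptions, not anything AMBER or CML define.

```python
import hashlib


def fingerprint_parameters(text: str) -> str:
    """Return a SHA-256 hex digest of a parameter file's normalised text."""
    # Normalise: drop trailing whitespace and blank lines so that purely
    # cosmetic differences do not change the fingerprint.
    lines = [line.rstrip() for line in text.splitlines()]
    normalised = "\n".join(line for line in lines if line)
    return hashlib.sha256(normalised.encode("utf-8")).hexdigest()


# Two hypothetical copies distributed under the same name; one value was edited.
original = "CT-CT  310.0  1.526\nCT-HC  340.0  1.090\n"
hacked = "CT-CT  310.0  1.526\nCT-HC  331.0  1.090\n"

# Cosmetic changes (extra blank lines) keep the fingerprint stable ...
same_name_same_values = (
    fingerprint_parameters(original) == fingerprint_parameters(original + "\n  \n")
)
# ... while a changed value produces a different fingerprint.
same_name_different_values = (
    fingerprint_parameters(original) != fingerprint_parameters(hacked)
)
```

A fingerprint does not say which copy is authoritative; it only makes a silent change to the values detectable.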
 
 

Yes - this is very much part of our goal. There are several components to the input:
* coordinates
* parameterisation (e.g. what force fields, methods, basis sets)
* constraints (pressure, dielectric, etc.)
* control (what operations to carry out)
* machine and job dependent quantities (memory, cutoffs, etc)
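
As a rough sketch of how the five components listed above might sit together in code, here is a hypothetical Python container. The class name, field names and example values are all assumptions made for illustration; the thread does not define such a structure.

```python
from dataclasses import dataclass, field


@dataclass
class JobInput:
    """Hypothetical container mirroring the five input components."""
    coordinates: list = field(default_factory=list)       # atom positions
    parameterisation: dict = field(default_factory=dict)  # force field, method, basis set
    constraints: dict = field(default_factory=dict)       # pressure, dielectric, ...
    control: list = field(default_factory=list)           # operations to carry out
    machine: dict = field(default_factory=dict)           # memory, cutoffs, ...

    def missing_components(self) -> list:
        """Names of components that are still empty, in declaration order."""
        return [name for name, value in vars(self).items() if not value]


job = JobInput(
    coordinates=[("O", 0.0, 0.0, 0.0)],
    parameterisation={"forceField": "amber:parm94"},
    control=["minimise"],
)
still_needed = job.missing_components()
```

Keeping the machine-dependent quantities in their own field makes it easier to reuse the scientific part of a job description on a different machine.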

We have a lot of the high-level design but are still working out the technical implementation
Maybe I could start off with a dictionary of the design we discussed in our project, adapt it to your dictionary schema and use it. I would use this as a solution for the next 3 months, as I get the feeling you plan to fully refactor this part of CML to capture the aforementioned properties of an MD simulation. I can imagine it will take longer than the next few weeks, and we have to design and create something that works by the end of May.

"we" includes "everyone". All our discussions are public and we share our output. If you have an idea it's up to the community  to decide what happens. The only real criterion is that it has to be implementable - we do not build vapourware. Generally the person suggesting something ends up implementing it!

If you are interested I would send the dictionary/ies to you, when they are ready or at least on a good way to it, and keep you informed about our achievements. I would stay in the fashion of the compchemDict.cml although I have the impression you are planning a new or extended approach.

Is there a way to keep track of your latest developments? A list where you announce major or even minor changes?

Quixote is a good place and also we are revitalising the CML Blog (Joe?)
 
I will then use an xslt script to generate an input file for a specific application.
The idea is to use CML as general description that can then be converted to the specific input files for a computational chemistry application.

Exactly. And we'd be delighted for you to take a lead in this!
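
To illustrate the idea of converting a general CML description into a program-specific input, here is a minimal sketch. It uses Python's standard library rather than XSLT, and it targets the simple XYZ coordinate format only because that is easy to show; the function name and the sample molecule are assumptions, and this is not the converter either project uses.

```python
import xml.etree.ElementTree as ET

CML_NS = "{http://www.xml-cml.org/schema}"

# A small hand-written CML molecule used only as sample input.
CML_DOC = """
<molecule xmlns="http://www.xml-cml.org/schema" id="water">
  <atomArray>
    <atom id="a1" elementType="O" x3="0.000" y3="0.000" z3="0.117"/>
    <atom id="a2" elementType="H" x3="0.000" y3="0.757" z3="-0.469"/>
    <atom id="a3" elementType="H" x3="0.000" y3="-0.757" z3="-0.469"/>
  </atomArray>
</molecule>
"""


def cml_to_xyz(cml_text: str) -> str:
    """Convert the atoms of a CML molecule into XYZ-format text."""
    root = ET.fromstring(cml_text)
    atoms = root.findall(f".//{CML_NS}atom")
    # XYZ format: atom count, a comment line, then one line per atom.
    lines = [str(len(atoms)), root.get("id", "")]
    for atom in atoms:
        x, y, z = (float(atom.get(key)) for key in ("x3", "y3", "z3"))
        lines.append(f"{atom.get('elementType')} {x:.3f} {y:.3f} {z:.3f}")
    return "\n".join(lines)


xyz = cml_to_xyz(CML_DOC)
```

A converter for a real MD or QM code would need one such function (or stylesheet) per target program, which is where shared dictionaries become useful.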
Kind regards,

Sebastian


P.
 
 
I thought the xslt parser would decide if they can translate a part of cml to a specific code.

I don't understand :-)
I hope the paragraph above could give you an idea of the workflow we want to approach with CML.


We expect the results of this to go onto the Quixote wiki fairly soon.

P.

Cheers,

Sebastian

 
I could not find any ff or vdw, either in the dictionaries Joe pointed me to or in the xsd. Is there a dictionary that I could also extend with certain parameters, like a force field name, in the way the basis sets are defined in compchemDict.cml?

We are at an early stage in the development of CML for force fields and we'd love to have your input.
 
Or that I could extend by certain water models or coupling algorithms for the pressure and temperature, respectively?

If the models have unique identifiers these can go in the dictionary (e.g. TIP3 is a reasonable dictionary term).
Generally CML deals with:
* objects - very well
* relationships - medium
* processes - not very well
 
I think we could give you a lot of input and maybe we could also write and extend some dictionaries that can be integrated in your language stack.

It sounds like we are starting to get a nucleus of interested people, and that is when this starts to take off. I have copied in some of the Quixotans.

Current state:
* have complete parser for AMBER.parm (94 and 99) input
* am looking at AMBER output log files (not trajectories).

Kind regards,

Sebastian


On 03.02.2011 17:36, Peter Murray-Rust wrote:


On Thu, Feb 3, 2011 at 4:13 PM, Sebastian Breuers <breu...@uni-koeln.de> wrote:


On 03.02.2011 16:59, Peter Murray-Rust wrote:


On Thu, Feb 3, 2011 at 3:48 PM, Sebastian Breuers <breu...@uni-koeln.de> wrote:
Hello, Peter,

if we use force fields in the description of a molecular dynamics job, would the description of the parameters be necessary? I think it could be helpful if you want to define a new force field, since then you also have to define the functions to which the values for the atom combinations should be applied. But mostly the programs know how to deal with atoms having a certain atom type and a certain topology.

The particular *programs* may know, but we need things that transfer between programs and here the semantics have to be independent of the program. IOW we can print them in a paper.


My idea was more that the xslt, i.e. the parser, should do the translation between the CML force field name entry and the related entry one should do for the specific program.

You mean that the dictionary contained the force field info and this was translated by the stylesheet. That may be possible. We are discussing dictionaries at the moment.

I am doing a simple example of AMBER input - that will make it clearer what we need to talk about.




--
Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069


-- 
_____________________________________________________________________________

Sebastian Breuers               Tel: +49-221-470-4108
EMail: breu...@uni-koeln.de    

Universität zu Köln             University of Cologne
Department für Chemie           Department of Chemistry
Organische Chemie               Organic Chemistry

Greinstraße 6                   Greinstraße 6
Raum 325                        Room 325
D-50939 Köln                    D-50939 Cologne, Federal Rep. of Germany
_____________________________________________________________________________




Peter Murray-Rust

Feb 14, 2011, 12:39:27 PM
to Sebastian Breuers, quixote-...@googlegroups.com, Joe Townsend, ste...@zib.de, mw...@cam.ac.uk, Jens Thomas, Pablo Echenique, Weerapong Phadungsukanan


On Mon, Feb 14, 2011 at 3:04 PM, Sebastian Breuers <breu...@uni-koeln.de> wrote:
Hey,


The dictionaries are being refactored this weekend. Joe and Sam have given us templates and I will probably go through the existing dictionaries and make them conform and also refactor the content.

So is there a possibility to get access to these refactored versions also? Do I have to checkout CMLlite?

I am literally hacking the first dictionary as I speak. (ca 200 terms). The checking is quite strict so it takes a little time.

No you don't have to check out anything - the dictionary is in XML, and there is a stylesheet


For force fields there are no authorities providing values so it's important to show the explicit values.

Ok, I see. I can understand that we then would have problems to address particular force fields by just a force field name.
I'm very interested to know how to address the force fields in your parameterized way.

You can see the Amber one at https://bitbucket.org/wwmm/jumbo-converters/src/733e85b50c20/jumbo-converters-compchem/src/test/resources/compchem/amber/in/ref/parm94.xml (the raw parse, but it uses the terms in the AMBER documentation). I will be talking with Joe and Sam about how to make it more CML-like. But basically it defines atom types and the properties of 1-, 2-, 3- and 4-tuples.
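
As a rough picture of what "atom types and 1-, 2-, 3- and 4-tuples" could look like once loaded, here is a hypothetical in-memory layout in Python. The type alias, the property names and the numbers are placeholders chosen for illustration; they are not taken from parm94.xml and are not vetted parameter values.

```python
from typing import Dict, Tuple

# Parameters keyed by tuples of atom types: a 1-tuple is an atom type,
# a 2-tuple a bond, a 3-tuple an angle and a 4-tuple a torsion.
Parameters = Dict[Tuple[str, ...], Dict[str, float]]

force_field: Parameters = {
    ("CT",): {"mass": 12.01},
    ("CT", "HC"): {"forceConstant": 340.0, "length": 1.09},
    ("HC", "CT", "HC"): {"forceConstant": 35.0, "angle": 109.5},
    ("HC", "CT", "CT", "HC"): {"barrier": 0.15, "periodicity": 3.0},
}


def lookup(params: Parameters, *atom_types: str) -> Dict[str, float]:
    """Find a term, trying the tuple as given and then reversed,
    since a bond A-B is the same term as B-A."""
    key = tuple(atom_types)
    if key in params:
        return params[key]
    return params[key[::-1]]  # raises KeyError if neither order is present


bond = lookup(force_field, "HC", "CT")
```

Storing explicit values this way is what allows a calculation to record exactly which parameters it used, instead of only a force field name.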

3? domains? I have QM and FF/MD so I would have 2 domains. There would also be a general dictionary which was independent of method

We decided to also introduce and cover Docking as an additional domain for molecular simulation.

OK - I shall not be doing this as high priority (it's quite complex and it relies on the other components being solved)


IF the keyword is accepted by the community then that is probably OK. Unfortunately different versions of the same program and certainly mutants of the program change the force field values but not the name. If FFs had version numbers I would be relaxed. But saying "MM2" or "AMBER94" is a fragile way of defining a set of values.


AMBER reads the values from the file so there has to be some way of translating "AMBER-PARM94" to a set of values. How do we do this so everyone has the same data?

I was not aware that there are differences in a force field named by the same name.

There shouldn't be but people redistribute hacked files
 
Are you storing the force field parameters, converted to an XML/CML dialect, in a separate database file? If I understand you correctly, your approach starts from the output of a simulation that was run with a certain force field and some definite parameters. From that you generate a CML file that also holds the information about the parameters of the force field and the results.

The force field is independent of the particular calculations, but not the reverse. We feel it's necessary to be able to identify the parameters used in a calculation.

Let's assume you start from some point before the simulation. Then I specify in some way a force field. To keep it easy I would have started with a force field keyword and maybe also with the program and version, since I have now learned that there are differences in the force fields in different versions. Then to compose the CML (to fully describe my job setup) I would do a lookup to find the parameters for my force field.
 
If the lookup is robust, that's true, but it doesn't normally work that way. To give a QM example, I'm told that "B3LYP" in Gaussian is different from B3LYP in other programs.


How would I do this with the parameterized force fields in CML?

How would you look up? Someone has to take responsibility. In basis sets it seems to be PNL. For force-fields there is no-one. So perhaps we shall need to do something in Quixote or Blue Obelisk. (Note that Quixote's current priority is QM, not FF)


Maybe I could start off with a dictionary of the design we discussed in our project, adapt it to your dictionary schema and use it. I would use this as a solution for the next 3 months, as I get the feeling you plan to fully refactor this part of CML to capture the aforementioned properties of an MD simulation. I can imagine it will take longer than the next few weeks, and we have to design and create something that works by the end of May.

"we" includes "everyone". All our discussions are public and we share our output. If you have an idea it's up to the community  to decide what happens. The only real criterion is that it has to be implementable - we do not build vapourware. Generally the person suggesting something ends up implementing it!
I understand that we have to lead this discussion in public. It is also not my intention to create vapourware. I will do my best to generate a reasonable proposal.

Excellent.
 


Quixote is a good place and also we are revitalising the CML Blog (Joe?)
You mean to edit the wiki pages on the quixote project?

No - they can be edited anyway - there was a specific CML Blog.

P.

Peter Murray-Rust

Feb 15, 2011, 12:23:39 PM
to Sebastian Breuers, quixote-...@googlegroups.com, Joe Townsend, ste...@zib.de, mw...@cam.ac.uk, Jens Thomas, Pablo Echenique, Weerapong Phadungsukanan

[Quixote colleagues - it may be useful to abstract some of this discussion for the wiki...]

On Tue, Feb 15, 2011 at 3:44 PM, Sebastian Breuers <breu...@uni-koeln.de> wrote:

So is there a possibility to get access to these refactored versions also? Do I have to checkout CMLlite?

I am literally hacking the first dictionary as I speak. (ca 200 terms). The checking is quite strict so it takes a little time.

No you don't have to check out anything - the dictionary is in XML, and there is a stylesheet
The question was more about where I can get it. To me this was quite abstract as I could not get a glimpse at your work and I only have the compChemDict.cml at hand.

I am doing them now :-).

The problem is that we have been building up experience over the years and although the dictionaries have been "valid CML" they haven't been consistent in their usage of the components. This is what we are now calling "conventions". There is a dictionary convention:
 
http://www.xml-cml.org/convention/dictionary

which specifies what is allowed and what is not allowed in a dictionary. I'm having to refactor dictionary by dictionary.  There are about 20 and they are in different stages of completeness.

I will make the CASTEP dictionary available, through Joe, hopefully this afternoon.

I will be working on the other dictionaries as I fly to Italy tomorrow. It's a good activity for airports as a lot of the stuff is tedious manual hack. Once it's done then it's trivial to convert to other formats.


Ok, I see. I can understand that we then would have problems to address particular force fields by just a force field name.
I'm very interested to know how to address the force fields in your parameterized way.

You can see the Amber one at https://bitbucket.org/wwmm/jumbo-converters/src/733e85b50c20/jumbo-converters-compchem/src/test/resources/compchem/amber/in/ref/parm94.xml (the raw parse, but it uses the terms in the AMBER documentation). I will be talking with Joe and Sam about how to make it more CML-like. But basically it defines atom types and the properties of 1-, 2-, 3- and 4-tuples.
Got it, saw it and understood that you are converting for different programs the force field configuration files into two files. A program specific dictionary file

Yes.
and a program specific CML database file that refers to entries in the specific dictionary file.
Yes. It needn't be in a database and most are probably littered around the Internet. We'd like to get them all collected in the Blue Obelisk. They aren't big.
 
And this combination can then be used in a structure description that in turn refers to the CML database file that contains the force field parameters.

Yes.

Dictionary + force-field + structure ==> program input


We decided to also introduce and cover Docking as an additional domain for molecular simulation.

OK - I shall not be doing this as high priority (it's quite complex and it relies on the other components being solved)
We will try to deal with that issue.


Let's assume you start from some point before the simulation. Then I specify in some way a force field. To keep it easy I would have started with a force field keyword and maybe also with the program and version, since I have now learned that there are differences in the force fields in different versions. Then to compose the CML (to fully describe my job setup) I would do a lookup to find the parameters for my force field.
 
If the lookup is robust, that's true, but it doesn't normally work that way. To give a QM example, I'm told that "B3LYP" in Gaussian is different from B3LYP in other programs.

How would I do this with the parameterized force fields in CML?

How would you look up? Someone has to take responsibility. In basis sets it seems to be PNL. For force-fields there is no-one. So perhaps we shall need to do something in Quixote or Blue Obelisk. (Note that Quixote's current priority is QM, not FF)
I think I did not express myself clear enough. I really want to use CML.

That's great!
 
This is why I described my setup above. As I am the one who is responsible for the development and usage, this was the practical question of 'How to do it?'. I think I got my answer with your example of the parm file, if the summary I wrote of your current work writing the amber dictionary is correct (cf. answer referring to your link).

Yes. We have tended to do these things on an ad hoc basis - when there is a need we work on the specific subdomain. Since I have a colleagues working on Amber I thought I would see if I could solve that.

Still got the question how I include a dictionary into my CML? Meaning the real, actual xml string to refer to the dictionary itself. So that a 'dictRef' in my document would point to a real entry in a real file.

The idea is that

<cml xmlns:amber="http://www.xml-cml.org/dictionary/amber/" ... >

will be both an identifier AND an address. So the name defines the dictionary and the name also acts as an address. This is TimBL's great idea of conflating names and addresses. It works when you have control over the server, fails when you migrate.

<property dictRef="amber:parm94" .../>

will point to a dictionary entry

http://www.xml-cml.org/dictionary/amber#parm94 ... >
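
The resolution step described above can be sketched in a few lines of Python. This is an illustration under assumptions: the function name is made up, the namespace map is passed in as a plain dict (in a real document it would come from the xmlns declarations), and the rule of dropping a trailing slash before appending `#` plus the entry id is inferred from the two URIs shown in this message rather than taken from a specification.

```python
def resolve_dict_ref(dict_ref: str, namespaces: dict) -> str:
    """Turn a prefixed dictRef such as 'amber:parm94' into a URI."""
    prefix, _, entry_id = dict_ref.partition(":")
    if not entry_id:
        raise ValueError(f"dictRef {dict_ref!r} has no prefix")
    # Assumed rule: namespace URI without trailing slash, then '#', then the id.
    base = namespaces[prefix].rstrip("/")
    return f"{base}#{entry_id}"


namespaces = {"amber": "http://www.xml-cml.org/dictionary/amber/"}
uri = resolve_dict_ref("amber:parm94", namespaces)
```

Because only the namespace map decides where the prefix points, swapping the map is enough to redirect the same dictRef to a different copy of the dictionary.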

A further one is if the CML dictionaries Joe pointed me to (by directly downloading them from the xml-cml.org) are all that are available (compchemDict.cml,  molecule.cml,  property.cml,  propertyG03.cml,  unitTypeDict.cml)?
You are currently refactoring the CML dictionaries. Is there a way to get access to them? To those changed one, not the one I already downloaded.

They will be posted on the xml-cml site very soon. Just the CASTEP one to start with. Expect the others over a few days.
 
If I would come up with some suggestions and maybe requirements to the CML or the dictionaries it would be reasonable to base these questions on top of those refactored dictionaries and not outdated ones (Just to comment and clarify the aim of my questions.).

Yes. Assume we are starting from scratch on most dictionaries
 

If it is not yet possible to get access to those refactored dictionaries can you estimate till when they are presentable to the public. On the basis of that we (MoSGrid) can decide if we can afford to wait or follow the old standards and the information basis I received and gathered up to now.

Anything you have gathered on dictionaries will need to be refactored. Think of this as releasing CML Dictionary V0.1 in a week or two.

This is all very exciting. Did you meet Christoph Steinbeck when he was at the Biozentrum in Koeln? He is now at EBI near Cambridge.

P.
 

Sebastian Breuers

Feb 15, 2011, 12:53:27 PM
to Peter Murray-Rust, quixote-...@googlegroups.com, Joe Townsend, ste...@zib.de, mw...@cam.ac.uk, Jens Thomas, Pablo Echenique, Weerapong Phadungsukanan
On 15.02.2011 18:23, Peter Murray-Rust wrote:

[Quixote colleagues - it may be useful to abstract some of this discussion for the wiki...]

On Tue, Feb 15, 2011 at 3:44 PM, Sebastian Breuers <breu...@uni-koeln.de> wrote:

So is there a possibility to get access to these refactored versions also? Do I have to checkout CMLlite?

I am literally hacking the first dictionary as I speak. (ca 200 terms). The checking is quite strict so it takes a little time.

No you don't have to check out anything - the dictionary is in XML, and there is a stylesheet
The question was more about where I can get it. To me this was quite abstract as I could not get a glimpse at your work and I only have the compChemDict.cml at hand.

I am doing them now :-).

The problem is that we have been building up experience over the years and although the dictionaries have been "valid CML" they haven't been consistent in their usage of the components. This is what we are now calling "conventions". There is a dictionary convention:
 
http://www.xml-cml.org/convention/dictionary
Fantastic, this brings me a great deal forward. I will try to make my efforts on the MD dictionary follow this dictionary convention.

which specifies what is allowed and what is not allowed in a dictionary. I'm having to refactor dictionary by dictionary.  There are about 20 and they are in different stages of completeness.

I will make the CASTEP dictionary available, through Joe, hopefully this afternoon.
Ok, I am really looking forward to it.

I will be working on the other dictionaries as I fly to Italy tomorrow. It's a good activity for airports as a lot of the stuff is tedious manual hack. Once it's done then it's trivial to convert to other formats.

Got it, saw it and understood that you are converting for different programs the force field configuration files into two files. A program specific dictionary file

Yes.
and a program specific CML database file that refers to entries in the specific dictionary file.
Yes. It needn't be in a database and most are probably littered around the Internet. We'd like to get them all collected in the Blue Obelisk. They aren't big.
 
And this combination can then be used in a structure description that in turn refers to the CML database file that contains the force field parameters.

Yes.

Dictionary + force-field + structure ==> program input
Ok, don't want to raise again a discussion ;) but I more thought the following

Program specific force field dictionary + force field parameters in a CML file (that was what I meant with database file) + structure ==> coordinate and topology input

+ MD dictionary    |     converter (for specific MD program) ==> input for specific MD program

This scheme can be adapted to QM and DC.


 
This is why I described my setup above. As I am the one who is responsible for the development and usage, this was the practical question of 'How to do it?'. I think I got my answer with your example of the parm file, if the summary I wrote of your current work writing the amber dictionary is correct (cf. answer referring to your link).

Yes. We have tended to do these things on an ad hoc basis - when there is a need we work on the specific subdomain. Since I have a colleagues working on Amber I thought I would see if I could solve that.

Still got the question how I include a dictionary into my CML? Meaning the real, actual xml string to refer to the dictionary itself. So that a 'dictRef' in my document would point to a real entry in a real file.

The idea is that

<cml xmlns:amber="http://www.xml-cml.org/dictionary/amber/" ... >

will be both an identifier AND an address. So the name defines the dictionary and the name also acts as an address. This is TimBL's great idea of conflating names and addresses. It works when you have control over the server, fails when you migrate.

<property dictRef="amber:parm94" .../>

will point to a dictionary entry

http://www.xml-cml.org/dictionary/amber#parm94 ... >
That makes it a lot easier. And as long as I am developing on my machine I could use local URIs.

A further one is if the CML dictionaries Joe pointed me to (by directly downloading them from the xml-cml.org) are all that are available (compchemDict.cml,  molecule.cml,  property.cml,  propertyG03.cml,  unitTypeDict.cml)?
You are currently refactoring the CML dictionaries. Is there a way to get access to them? To those changed one, not the one I already downloaded.

They will be posted on the xml-cml site very soon. Just the CASTEP one to start with. Expect the others over a few days.
 
If I would come up with some suggestions and maybe requirements to the CML or the dictionaries it would be reasonable to base these questions on top of those refactored dictionaries and not outdated ones (Just to comment and clarify the aim of my questions.).

Yes. Assume we are starting from scratch on most dictionaries
Ok. I will then post the ideas we collected for MD and create the proposal (the initial dictionary file based on those ideas).

 

If it is not yet possible to get access to those refactored dictionaries can you estimate till when they are presentable to the public. On the basis of that we (MoSGrid) can decide if we can afford to wait or follow the old standards and the information basis I received and gathered up to now.

Anything you have gathered on dictionaries will need to be refactored. Think of this as releasing CML Dictionary V0.1 in a week or two.

This is all very exciting. Did you meet Christoph Steinbeck when he was at the Biozentrum in Koeln? He is now at EBI near Cambridge.

P.
Yes. I had a practical course in bioinformatics in the group of Prof. Schomburg 4 or 5 years ago. Christoph led a subgroup there. I had the occasion to do a little development on the JChemPaint project. I was there when they started to refactor the JChemPaint application. At least I am still a member of the sourceforge JChemPaint and CDK projects. :D

Cheers,

Sebastian

 




Peter Murray-Rust

Feb 15, 2011, 1:35:44 PM
to Sebastian Breuers, quixote-...@googlegroups.com, Joe Townsend, ste...@zib.de, mw...@cam.ac.uk, Jens Thomas, Pablo Echenique, Weerapong Phadungsukanan
On Tue, Feb 15, 2011 at 5:53 PM, Sebastian Breuers <breu...@uni-koeln.de> wrote:
On 15.02.2011 18:23, Peter Murray-Rust wrote:
Dictionary + force-field + structure ==> program input
Ok, don't want to raise again a discussion ;) but I more thought the following

Program specific force field dictionary + force field parameters in a CML file (that was what I meant with database file) + structure ==> coordinate and topology input

Yes - I was just being rather fuzzy

+ MD dictionary    |     converter (for specific MD program) ==> input for specific MD program

This scheme can be adapted to QM and DC.

Yes

There may be other inputs such as machine parameters, memory, timings, etc.

Yes. Assume we are starting from scratch on most dictionaries
Ok. I will then post the ideas we collected for MD and create the proposal (the initial dictionary file based on those ideas).

Good. I think you'll find that in a day or so that you'll be fully in touch
 

P.
Yes. I had a practical course in bioinformatics in the group of Prof. Schomburg 4 or 5 years ago. Christoph led a subgroup there. I had the occasion to do a little development on the JChemPaint project. I was there when they started to refactor the JChemPaint application. At least I am still a member of the sourceforge JChemPaint and CDK projects. :D

Great I am meeting Christoph for a beer and will give greetings
 

Sebastian Breuers

Feb 16, 2011, 3:02:24 AM
to Peter Murray-Rust, quixote-...@googlegroups.com, Joe Townsend, ste...@zib.de, mw...@cam.ac.uk, Jens Thomas, Pablo Echenique, Weerapong Phadungsukanan
On 15.02.2011 19:35, Peter Murray-Rust wrote:


On Tue, Feb 15, 2011 at 5:53 PM, Sebastian Breuers <breu...@uni-koeln.de> wrote:
On 15.02.2011 18:23, Peter Murray-Rust wrote:
Dictionary + force-field + structure ==> program input
Ok, don't want to raise again a discussion ;) but I more thought the following

Program specific force field dictionary + force field parameters in a CML file (that was what I meant with database file) + structure ==> coordinate and topology input

Yes - I was just being rather fuzzy

+ MD dictionary    |     converter (for specific MD program) ==> input for specific MD program

This scheme can be adapted to QM and DC.

Yes

There may be other inputs such as machine parameters, memory, timings, etc.
Ok, the metadata stuff, which I think should also be put in a separate dictionary. Currently it is in the compChemDict.cml and uses the shortcut 'md', which I would change to 'meta' to avoid confusion with MD.


Yes. Assume we are starting from scratch on most dictionaries
Ok. I will then post the ideas we collected for MD and create the proposal (the initial dictionary file based on those ideas).

Good. I think you'll find that in a day or so that you'll be fully in touch
 

P.
Yes. I had a practical course in bioinformatics in the group of Prof. Schomburg 4 or 5 years ago. Christoph led a subgroup there. I had the occasion to do a little development on the JChemPaint project. I was there when they started to refactor the JChemPaint application. At least I am still a member of the sourceforge JChemPaint and CDK projects. :D

Great I am meeting Christoph for a beer and will give greetings
Ok. That would be nice. Maybe he remembers me. :)

Have a nice trip to Italy and enjoy your stay.

Sebastian Breuers

Feb 16, 2011, 11:22:19 AM
to Peter Murray-Rust, quixote-...@googlegroups.com, Joe Townsend, ste...@zib.de, mw...@cam.ac.uk, Jens Thomas, Pablo Echenique, Weerapong Phadungsukanan
Hello,

I've got a new question concerning the definition of force fields:

As I understand it so far, the configuration files of specific programs will be converted into a program-specific dictionary and a database file that references the dictionary and contains the parameters of the force field.

I was wondering if there is something like a superclass dictionary that contains terminologies that are common to force fields in general, e.g. force constants or harmonic potentials, dihedral specifications or electrostatic parameters of certain atom types?

Kind regards,

Sebastian



Peter Murray-Rust

Feb 16, 2011, 3:18:40 PM
to Sebastian Breuers, quixote-...@googlegroups.com, Joe Townsend, ste...@zib.de, mw...@cam.ac.uk, Jens Thomas, Pablo Echenique, Weerapong Phadungsukanan
On Wed, Feb 16, 2011 at 4:22 PM, Sebastian Breuers <breu...@uni-koeln.de> wrote:
Hello,

I've got a new question concerning the definition of force fields:

As I understand it so far, the configuration files of specific programs will be converted into a program-specific dictionary and a database file that references the dictionary and contains the parameters of the force field.

I was wondering if there is something like a superclass dictionary that contains terminologies that are common to force fields in general, e.g. force constants or harmonic potentials, dihedral specifications or electrostatic parameters of certain atom types?

That's our overall plan. We expect to collect a number of code-specific dictionaries and then to see what commonality can be taken to a superset.
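
One simple way to look for that commonality, sketched in Python, is to count how many code-specific dictionaries contain each term. Everything here is hypothetical: the dictionary names "codeB" and "codeC", the term ids and the threshold parameter are invented for illustration, and real dictionaries would first need their terms mapped to shared ids before such a comparison means anything.

```python
from collections import Counter

# Hypothetical term lists from three code-specific dictionaries.
code_dictionaries = {
    "amber": {"bondForceConstant", "equilibriumBondLength", "torsionBarrier", "termOnlyInAmber"},
    "codeB": {"bondForceConstant", "equilibriumBondLength", "torsionBarrier", "termOnlyInB"},
    "codeC": {"bondForceConstant", "equilibriumBondLength", "termOnlyInC"},
}


def common_terms(dictionaries: dict, min_codes: int) -> set:
    """Terms that occur in at least `min_codes` of the dictionaries."""
    counts = Counter(term for terms in dictionaries.values() for term in terms)
    return {term for term, n in counts.items() if n >= min_codes}


shared_by_all = common_terms(code_dictionaries, min_codes=3)
shared_by_two = common_terms(code_dictionaries, min_codes=2)
```

Terms shared by every code are natural candidates for a general force field dictionary, while those shared by only some codes need a community decision.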
 
P.