units

Tamás Beke-Somfai

Oct 19, 2010, 1:33:50 PM
to Quixote project on QC databases - Developers
Dear Weerapong,

I will keep on pasting these two or three at a time, if that is OK.
If it is more convenient to have all of the corrected ones in one file,
then I could do that as well.


ZPVE
I am a bit confused about this one, so I made two versions (depending on which
value you extract):

1THIS: Zero-point vibrational energy 3226266.5 (Joules/Mol)
2THIS: 771.09619 (Kcal/Mol)

Or do you extract the value from a third place?
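
As a quick check that the two numbers above are the same quantity in different units, the sketch below converts the kcal/mol value to J/mol. It assumes the thermochemical calorie (1 cal = 4.184 J); the log excerpt does not say which calorie was used, so that factor is an assumption, and the function name is only illustrative.

```python
import math

# Values copied from the log excerpt above.
ZPVE_J_PER_MOL = 3226266.5
ZPVE_KCAL_PER_MOL = 771.09619

# Assumption: thermochemical calorie, so 1 kcal = 4184 J.
J_PER_KCAL = 4184.0


def kcal_to_joule(value_kcal: float) -> float:
    """Convert a molar energy from kcal/mol to J/mol."""
    return value_kcal * J_PER_KCAL


converted = kcal_to_joule(ZPVE_KCAL_PER_MOL)
# Compare with a relative tolerance that allows for the rounding of the printed values.
values_agree = math.isclose(converted, ZPVE_J_PER_MOL, rel_tol=1e-7)
print(converted, values_agree)
```

If the check passes, either value can be extracted without losing information, as long as the units attribute matches the value that was taken.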

1THIS:
<entry id="property.zpve"
term="zpve"
definition="Zero-Point Vibrational Energy (ZPVE)"
description="The total sum of energies of all vibrational
modes of a molecule at absolute zero (0 K)."
cmlx:superclass="property"
cmlx:status="unreviewed">
<cmlx:template>
<cmlx:scalar dataType="xsd:double"
units="units:J.mol-1"/>
</cmlx:template>
</entry>

2THIS:
<entry id="property.zpve"
term="zpve"
definition="Zero-Point Vibrational Energy (ZPVE)"
description="The total sum of energies of all vibrational
modes of a molecule at absolute zero (0 K)."
cmlx:superclass="property"
cmlx:status="unreviewed">
<cmlx:template>
<cmlx:scalar dataType="xsd:double"
units="units:kcal.mol-1"/>
</cmlx:template>
</entry>




Polarizability:

<entry id="property.polarizability"
term="polarizability"
definition="Polarizability"
description=""
cmlx:superclass="property"
cmlx:status="unreviewed">
<cmlx:template>
<cmlx:array dataType="xsd:float" length="6"
delimiter="" units="units:Bohr**3"/>
</cmlx:template>
</entry>


Dipole Moment:

<entry id="property.dipole"
term="dipole"
definition="Dipole"
description=""
cmlx:superclass="property"
cmlx:status="unreviewed">
<cmlx:template>
<cmlx:vector3 dataType="xsd:float" delimiter=""
units="units:Debye"/>
</cmlx:template>
</entry>


/Tamas

Peter Murray-Rust

Oct 19, 2010, 4:49:00 PM
to quixote-...@googlegroups.com
On Tue, Oct 19, 2010 at 6:33 PM, Tamás Beke-Somfai <bektom...@gmail.com> wrote:
Dear Weerapong,

I will keep on pasting these two or three at a time, if that is OK.
If it is more convenient to have all of the corrected ones in one file,
then I could do that as well.

This is very good. Does it make sense to use the Wiki?



--
Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069

Weerapong Phadungsukanan

Oct 19, 2010, 6:56:42 PM
to quixote-...@googlegroups.com
On Tue, Oct 19, 2010 at 6:33 PM, Tamás Beke-Somfai <bektom...@gmail.com> wrote:
Ideally, it should not matter which one of those we get as long as the units are correct. The units I gave in my dictionary are units="molar_energy:unknown", which isn't quite right. It should be units="unitsType:molar_energy" or units="units:kcal.mol-1" or units="units:J.mol-1". Feel free to change it any way you like. As long as it is well documented and clear, this should be how we develop Quixote in the early stage.
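
To illustrate how the units attribute of an entry could be read back and checked, here is a minimal sketch using Python's standard xml.etree.ElementTree. The thread never shows the namespace URI bound to the cmlx prefix, so the URI below is a placeholder, and the wrapping <dictionary> element and both function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumption: placeholder namespace URI, used only so that the snippet parses.
CMLX_NS = "http://example.org/cmlx-placeholder"

# One entry from the thread, wrapped in a hypothetical root element.
ENTRY_XML = f"""
<dictionary xmlns:cmlx="{CMLX_NS}">
  <entry id="property.zpve" term="zpve"
         definition="Zero-Point Vibrational Energy (ZPVE)"
         cmlx:superclass="property" cmlx:status="unreviewed">
    <cmlx:template>
      <cmlx:scalar dataType="xsd:double" units="units:J.mol-1"/>
    </cmlx:template>
  </entry>
</dictionary>
"""


def units_by_entry(xml_text: str) -> dict[str, str]:
    """Map each entry id to the units declared inside its template."""
    root = ET.fromstring(xml_text)
    result = {}
    for entry in root.iter("entry"):
        template = entry.find(f"{{{CMLX_NS}}}template")
        if template is None:
            continue
        for child in template:
            units = child.get("units")
            if units is not None:
                result[entry.get("id")] = units
    return result


def split_units(qname: str) -> tuple[str, str]:
    """Split a value such as 'units:J.mol-1' into prefix and local name."""
    prefix, _, local = qname.partition(":")
    return prefix, local


declared = units_by_entry(ENTRY_XML)
print(declared)
print(split_units(declared["property.zpve"]))
```

A check like this would flag entries whose units prefix is not "units" or "unitsType", such as the "molar_energy:unknown" value mentioned above.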




Polarizability:

     <entry id="property.polarizability"
           term="polarizability"
           definition="Polarizability"
           description=""
           cmlx:superclass="property"
           cmlx:status="unreviewed">
           <cmlx:template>
               <cmlx:array dataType="xsd:float" length="6"
                   delimiter="" units="units:Bohr**3"/>
           </cmlx:template>
       </entry>


Dipole Moment:

       <entry id="property.dipole"
           term="dipole"
           definition="Dipole"
           description=""
           cmlx:superclass="property"
           cmlx:status="unreviewed">
           <cmlx:template>
               <cmlx:vector3 dataType="xsd:float" delimiter=""
                   units="units:Debye"/>
           </cmlx:template>
       </entry>

I agree with all your changes.

The reason I did not put units for many entries is that those are not properties I am familiar with, and I have no use for them at the moment. Extending the dictionary requires collaboration from people who understand it well. Since we have not made it clear how we are going to deal with units, this problem should be noted down as one of the issues.

Weerapong

Tamás Beke-Somfai

Oct 20, 2010, 5:15:45 AM
to Quixote project on QC databases - Developers

> I agree with all your changes.
>
> The reason I did not put units for many entries is that those are not properties
> I am familiar with, and I have no use for them at the moment. Extending the
> dictionary requires collaboration from people who understand it well. Since
> we have not made it clear how we are going to deal with units, this problem
> should be noted down as one of the issues.
>
> Weerapong

Yeah, so I think we should clearly state a policy on our preferred
units and then let people convert them. It might be easy and
straightforward to include a link to some of the big online 'metric
unit converters'.
I would suggest that when we have multiple units we should use
this priority:
1. atomic units (since we do QC calcs.)
2. SI units
3. Whatever we have

So J.mol-1 would be preferred over kcal.mol-1 (even if people in, e.g.,
the US still use kcal/mol).
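
The priority order suggested above can be sketched as a small selection function. The unit labels and their grouping into "atomic" and "SI" sets are illustrative assumptions, as are the function names; the thread only fixes the order (atomic units, then SI, then whatever is available).

```python
# Assumption: illustrative unit labels; a real dictionary would define these sets.
ATOMIC_UNITS = {"Hartree", "Bohr", "Bohr**3"}
SI_UNITS = {"J.mol-1", "J", "m"}


def unit_rank(unit: str) -> int:
    """Return the priority of a unit label; lower means more preferred."""
    if unit in ATOMIC_UNITS:
        return 0
    if unit in SI_UNITS:
        return 1
    return 2


def pick_preferred(reported: dict[str, float]) -> tuple[str, float]:
    """Pick one (unit, value) pair from the values a program printed.

    No conversion is performed: the chosen value is returned as printed.
    """
    if not reported:
        raise ValueError("no values reported")
    # min() keeps the first-reported unit when two units share a rank.
    unit = min(reported, key=unit_rank)
    return unit, reported[unit]


zpve = {"kcal.mol-1": 771.09619, "J.mol-1": 3226266.5}
print(pick_preferred(zpve))
```

With the two ZPVE values from the start of the thread, this selects the J.mol-1 value and leaves the number untouched.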


>This is very good. Does it makes sense to use the Wiki??
I think going to the wiki with this is a good idea. I have not checked
yet how we should do it.

I will keep on expanding the entries with units: even if we do not need
them, it does not hurt and might be useful to have in the long run.
Weerapong, do I understand correctly that I should be able to modify the
code at http://bitbucket.org/gigadot/semantic-compchem/src/tip/src/main/resources/org/xmlcml/cml/semcompchem/dictionary/property.xml

and spare us some cutting and pasting? I am not that familiar with
that. Should I use the 'embed' option to edit?


/Tamas

Weerapong Phadungsukanan

Oct 20, 2010, 5:25:50 AM
to quixote-...@googlegroups.com
You have to clone the repository to your local directory from http://bitbucket.org/gigadot/semantic-compchem using Mercurial. Do you have any experience using Mercurial for version control? It is a source version control system similar to svn and git.

After that you can edit the files locally, commit the changes to your local repository and then push the changes from your local repository to Bitbucket.
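
The clone, commit and push steps described above could be scripted roughly as follows. This is only a sketch: the local directory name and commit message are hypothetical, pushing assumes you have write access to the repository, and the script defaults to a dry run that prints the hg commands instead of executing them.

```python
import shlex
import subprocess

REPO_URL = "http://bitbucket.org/gigadot/semantic-compchem"


def hg_workflow(local_dir: str, message: str) -> list[list[str]]:
    """Return the Mercurial commands for clone, commit and push."""
    return [
        ["hg", "clone", REPO_URL, local_dir],
        # Edit the dictionary file in local_dir before the next two steps.
        ["hg", "-R", local_dir, "commit", "-m", message],
        ["hg", "-R", local_dir, "push"],
    ]


def run(commands: list[list[str]], dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, execute it."""
    for argv in commands:
        print(shlex.join(argv))
        if not dry_run:
            subprocess.run(argv, check=True)


# Hypothetical directory name and commit message.
commands = hg_workflow("semantic-compchem", "Add units to property dictionary")
run(commands)  # dry run: prints the commands only
```

In practice the three commands would be run by hand, with the edit to property.xml made between the clone and the commit.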


and spare some cutting pasting for us? I am not that familiar with
that. Should I use the 'embed' option to edit?


You cannot make any changes from the web interface.
 
Weerapong

Tamás Beke-Somfai

Oct 20, 2010, 1:22:12 PM
to Quixote project on QC databases - Developers
Alright, I see. I used cvs for some years, but have not used any version
control recently.
I will check these out, thanks.

/Tamas


Peter Murray-Rust

Oct 20, 2010, 6:43:13 PM
to quixote-...@googlegroups.com
I have been working with units for a long time and know that there is no single acceptable solution. In some cases the original units matter to the authors - for example, volts per mm are not the same psychologically as kV per metre. Where the author has given units, I think they should be respected. It's a highly religious issue.

Some codes - such as CASTEP - standardise on SI units throughout. Others use atomic units, and others use a mixture.

On Wed, Oct 20, 2010 at 10:15 AM, Tamás Beke-Somfai <bektom...@gmail.com> wrote:


Yeah, so I think we should clearly state a policy on our preferred
units and then let people convert them. It might be easy and
straightforward to include a link to some of the big online 'metric
unit converters'.
I would suggest that when we have multiple units we should use
this priority:
1. atomic units (since we do QC calcs.)
2. SI units
3. Whatever we have

So J.mol-1 would be preferred over kcal.mol-1 (even if people in, e.g.,
the US still use kcal/mol).

Many chemical codes expect Angstroms, and converting to a.u. will cause masses of errors, especially if people know what the original units are.

Also, some unit conversions are not trivial. What is a kcal? Are you sure the authors are all using the same calorie? Because if not, you make it worse.

It's possible to do all this conversion when we create RDF. That's when we can add extra information to help searches, but I suggest not before.
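
The calorie question can be made concrete with a short sketch. It compares two common definitions, the thermochemical calorie (4.184 J) and the International Table calorie (4.1868 J), and keeps the reported value and unit untouched so that any conversion is a separate, later step. The Quantity class, the function name and the unit labels are illustrative assumptions, not part of any existing Quixote code.

```python
from dataclasses import dataclass

# Two common definitions of the calorie, in joules per calorie.
J_PER_CAL = {
    "thermochemical": 4.184,
    "international_table": 4.1868,
}


@dataclass(frozen=True)
class Quantity:
    """A value stored exactly as reported, with its original unit label."""
    value: float
    units: str


def kcal_per_mol_to_j_per_mol(q: Quantity, calorie: str) -> Quantity:
    """Derive a J/mol value; the original Quantity is left untouched."""
    if q.units != "kcal.mol-1":
        raise ValueError(f"expected kcal.mol-1, got {q.units}")
    return Quantity(q.value * J_PER_CAL[calorie] * 1000.0, "J.mol-1")


# ZPVE value quoted at the start of the thread.
reported = Quantity(771.09619, "kcal.mol-1")
for name in J_PER_CAL:
    print(name, kcal_per_mol_to_j_per_mol(reported, name).value)
```

For this ZPVE the two definitions differ by about 2.2 kJ/mol, which is why a silent conversion can make the stored data worse than the original.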


P.

Tamás Beke-Somfai

Oct 21, 2010, 3:21:45 AM
to Quixote project on QC databases - Developers


On Oct 21, 00:43, Peter Murray-Rust <pm...@cam.ac.uk> wrote:
> I have been working with units for a long time and know that there is no
> single acceptable solution. In some cases the original units matter to the
> authors - for example, volts per mm are not the same psychologically as kV
> per metre. Where the author has given units, I think they should be
> respected. It's a highly religious issue.
>
> Some codes - such as CASTEP - standardise on SI units throughout. Others use
> atomic units, and others use a mixture.
>
> Many chemical codes expect Angstroms, and converting to a.u. will cause
> masses of errors, especially if people know what the original units are.
>
> Also, some unit conversions are not trivial. What is a kcal? Are you sure
> the authors are all using the same calorie? Because if not, you make it
> worse.
>
> It's possible to do all this conversion when we create RDF. That's when we
> can add extra information to help searches, but I suggest not before.
>

Yes, I agree with the comment on people's religious approach.
What I simply wanted to say is that, in order to make our life
simpler and the extracted data clearer,
if a property is present in the output file in more than one unit
(for instance, the example ZPVE is present in different regions and in
three different units in a gaussian.log file),
then we should simply extract one of them.
We could give a link to unit converters, but we SHOULD NOT
convert the values ourselves because, as you said, that could be a source
of errors and also of mistrust from the users.
Maybe in the long run the best option is to slowly expand and extract all
of these data in the different units and provide them. I am not so sure.
