As discussed with some of you during the ICBO conference, I have done
some review and work on sequencing.
It turns out that the restrictions were a bit off. More details can be
found in the log.
I have added DNA sequencing as a defined class. For now I have had to
rely on 'information content entity' as the specified output in the
absence of sequence (or read, for that matter), but these will soon be
available. I have also distinguished DNA sequencing by use of DNA ligase
from DNA sequencing by use of DNA polymerase.
The classes created were pyrosequencing, chain termination sequence and
SOLiD sequencing.
The classifier runs smoothly and, interestingly, places 'genotyping' as
currently defined as a kind of DNA sequencing, which is correct.
It also places PCR-SSCP assay as a kind of DNA sequencing, but this is
down to the fact that we don't yet have enough shades under information
content entity (these are in the pipeline).
There is still a lot of work to carry out, but before going further I
wanted to give a heads-up. It would be nice if OBI could cover
sequencing technology better, since it is such a hot topic.
I'd now like to add various instruments and their suppliers (hence the
work on organization).
I will also need a number of materials (luciferase, various enzymes and
chemicals with the role of reagent), as well as roles/functions for
primers and adaptors, and processes related to clonal amplification and
library construction.
And finally, I will add metadata to the recently added classes.
All comments welcome.
Great to see some of you during the ICBO meeting; I think it has been a
good meeting for OBI.
--
Philippe Rocca-Serra, PhD
Technical Coordinator
www.ebi.ac.uk/net-project
The European Bioinformatics Institute email: ro...@ebi.ac.uk
EMBL Outstation - Hinxton direct: +44 (0)1223 492 553
Wellcome Trust Genome Campus fax: +44 (0)1223 492 620
Cambridge CB10 1SD, UK room: A3-141
--
_______________________________________________
Obi-protocol-application-branch mailing list
Obi-protocol-ap...@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/obi-protocol-application-branch
Indeed. And thanks for getting this started. Not surprisingly, I have
some comments on the restrictions, with a mind to giving an idea where
I think some of the work is.
I don't like the has_agent relation. It is very ambiguous and I've
been trying to get it removed from RO. I see that you want to use it
to make the difference between sequencing by ligation and sequencing
by synthesis, but I think we need to find a different way. I will
think some about how.
The specified output, information content entity, is way too general.
As it is, it could refer to a measurement of the mass of the supplied
DNA, for example, or a count of CpG islands, or behavioral assessments
of the technicians who do the work.
Definitions are missing. I know you say that metadata is not complete,
but not having some indication of the scope of what you mean by the
terms makes it hard to offer precise suggestions for improving the
logical definitions. In particular, what are the boundaries of the
process: are you thinking about one in which a vial of DNA is input
and a genome is output, or something at the chemical reaction level?
Is any kind of preparation included in these processes or not? Do
these processes include data transformations?
I believe that it should be added that these processes achieve planned
objective: sequence feature identification objective.
That's it for now.
Best,
Alan
> Obi-devel mailing list
> Obi-...@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/obi-devel
Hi Alan,
Thanks for the feedback, really helpful.
> I don't like the has_agent relation. It is very ambiguous and I've
> been trying to get it removed from RO. I see that you want to use it
> to make the difference between sequencing by ligation and sequencing
> by synthesis, but I think we need to find a different way. I will
> think some about how.
>
I used it on purpose to prompt a reaction. Essentially, what I wanted
to get at is the following:
Should the enzyme used in the sequencing reaction be described in the
same terms as the input DNA material?
I was looking at the definition of has_specified_input and it says the
following:
"The continuant realizes specified_Input_Role for that process. In
general, not all participants present at the beginning of the process
are specified_inputs."
Initially I used the has_specified_input relation instead of has_agent,
then had second thoughts.
The DNA polymerase or DNA ligase does realize its role / function in a
sequencing process.
Side note:
I have used continuant 'DNA ligase' and continuant 'DNA polymerase
complex'. I am not too happy about the 'complex' part, so I was thinking
of MIREOTing a 'DNA polymerase'.
> The specified output: information content entity, is way too general.
> As it is it could refer to getting a measurement of the mass of the
> supplied dna, for example, or a count of cpg islands, or behavioral
> assessments of the technicians who do the work.
>
As I said, I was aware of this (and pointed to the classification of
PCR-SSCP assay as a consequence of 'information content entity' being
too general).
View these as placeholders that will be refined as more additions are
made to information content entities.
There was also a technical reason: I wanted to confine all my additions
to PlanandPlannedProcesses.owl at the time of editing.
> Definitions are missing. I know you say that metadata is not complete,
> but not having some indication of the scope of what you mean by the
> terms makes it hard to offer precise suggestions for improving the
> logical definitions.
You are right, I will work on those today.
> In particular, what are the boundaries of the
> process - are you thinking about one in which a vial of dna is input
> and a genome is output, or something at the chemical reaction level.
> Is any kind of preparation included in these processes or not? Do
> these processes included data transformations?
>
I am with you here. I have very fine-grained representations for some of
the techniques, really drilling down. The work actually pointed to the
interesting issues you picked up on.
When describing new sequencing techniques, I have included references to
key steps, for instance using preceded_by 'immobilization' and
preceded_by 'amplification of a clone',
but I could not find a way to specify the order in which those steps
occur. Those steps make it possible to distinguish Helicos sequencing
(single-molecule sequencing, no amplification needed) from Solexa or
SOLiD methods, where an emulsion PCR is used for amplification.
Also, for a number of techniques, a sequence of subprocesses is
repeated (introduction of reagent mix, enzymatic reaction, washing,
imaging, cleavage, in a cycle which is run over and over).
How can we describe this 'motif' and these cycles?
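To make the question concrete, here is a rough sketch in plain Python
(not OWL, and not a proposal for OBI). It unrolls a fixed cycle of
subprocess steps into numbered occurrences and lists which occurrence
immediately precedes which. The step labels are taken from the list
above; the function names and the (cycle number, step) representation
are assumptions made up for illustration.

```python
# Step labels echo the thread; they are not OBI class labels.
CYCLE = [
    "introduction of reagent mix",
    "enzymatic reaction",
    "washing",
    "imaging",
    "cleavage",
]


def unroll(cycle, n_cycles):
    """Return ordered (cycle_number, step) occurrences for n_cycles repetitions."""
    return [(i, step) for i in range(1, n_cycles + 1) for step in cycle]


def preceded_by_pairs(occurrences):
    """Return (later, earlier) pairs for each pair of adjacent occurrences."""
    return [(later, earlier) for earlier, later in zip(occurrences, occurrences[1:])]


run = unroll(CYCLE, 2)
pairs = preceded_by_pairs(run)
for later, earlier in pairs:
    print(f"{later} preceded_by {earlier}")
```

The point of the sketch is only that the 'motif' is stated once and the
ordering between occurrences, including the step from the end of one
cycle to the start of the next, is derived from it rather than asserted
step by step.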
But more importantly, do we really need this level of detail? Hence the
scope issue. I've been conservative, choosing to insist on the input (a
library of genomic DNA fragments, which needs to be added) and the
output (possibly images, then sequence reads). On this side, I am
confident that the discussion in IAO and SO will give us what we need
with very good consistency.
> I believe that it should be added that these processes achieve planned
> objective: sequence feature identification objective.
>
+1. I overlooked this. Will add.
Bjoern pointed out that a range of information might also require
attention in the realm of genome assembly, in order to be able to
indicate redundancy, fold coverage, number of contigs and so forth.
This is information that matters to Dawn Field and the people behind
the Genomic Standards Consortium.
--
Philippe Rocca-Serra, PhD
Technical Coordinator
www.ebi.ac.uk/net-project
Hi Frank,
>
> yes, what is the issue here? We have the function catalytic_activity
>
The thing is that, based on the information I have collected from
manufacturers, I am pretty sure that DNA ligase (the protein) is added
(more precisely, it should be T4 phage DNA ligase)
and that DNA polymerase is added. A complex, as indicated by the
definition, comprises 2 or more subunits. We may want to set
restrictions on those classes to formally distinguish
protein complexes from the rest.
I think I am simply missing DNA polymerase in OBI at the moment: it may
be imported from... well, this is where it can be difficult to decide. Or,
just as for the current DNA ligase, we create a class in OBI, but I'd
rather not assert this in OBI since I feel it could live happily in
another resource and we should MIREOT it.
>
> I am not sure I follow. If you say immobilization is preceded_by
> amplification then you have said that immobilization is 1 and
> amplification is 2.
I am only saying that sequencing is preceded_by 'immobilization' and
preceded_by 'amplification'. I guess I need to change the restriction to
preceded_by some ('immobilization' and (preceded_by some 'amplification'))
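For what it is worth, the nesting generalises to longer chains of steps.
A minimal sketch in plain Python that only builds the Manchester-style
string (it does no reasoning); the function name preceded_by_chain is
hypothetical, and the steps are listed from the one closest to
sequencing back to the earliest one:

```python
def preceded_by_chain(steps, prop="preceded_by"):
    """Build a nested existential restriction from an ordered list of step labels.

    steps[0] is the step immediately before the process being defined,
    steps[-1] is the earliest step.
    """
    if not steps:
        raise ValueError("at least one step is required")
    # Start with the earliest step and wrap outwards.
    filler = f"'{steps[-1]}'"
    for label in reversed(steps[:-1]):
        filler = f"('{label}' and ({prop} some {filler}))"
    return f"{prop} some {filler}"


print(preceded_by_chain(["immobilization", "amplification"]))
```

With the two steps above this prints the restriction as written in this
message; with a single step it reduces to a plain
preceded_by some 'immobilization'.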
>
> Those steps allow to distinguish Helicos sequencing (single
> molecule sequencing no amplification needed) from Solexa or Solid
> methods where an emulsion PCR is used for amplification
>
>
> They are two distinct processes which could be represented by the following
>
> Helicos sequencing has_part single_molecule_sequencing
>
> Solexa has_part single_molecule_sequencing preceded_by (has_part
> amplification)
>
>
> (no guarantee that actually reasons though :)
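Whether a reasoner handles this is a separate question, but the intended
distinction itself is simple to state. A toy sketch in plain Python,
assuming each technique is described by an ordered list of part
processes; the part lists below are illustrative assumptions, not OBI
axioms and not vendor documentation:

```python
# Illustrative part lists only; labels echo the thread and are not OBI terms.
TECHNIQUES = {
    "Helicos sequencing": ["immobilization", "single molecule sequencing"],
    "SOLiD sequencing": ["amplification", "immobilization", "sequencing by ligation"],
}


def has_amplification_before_sequencing(parts):
    """True if an 'amplification' part occurs before the final (sequencing) part."""
    return "amplification" in parts[:-1]


def split_by_amplification(techniques):
    """Partition technique names into (amplified, single_molecule) lists."""
    amplified = [
        name for name, parts in techniques.items()
        if has_amplification_before_sequencing(parts)
    ]
    single_molecule = [name for name in techniques if name not in amplified]
    return amplified, single_molecule


amplified, single_molecule = split_by_amplification(TECHNIQUES)
print("amplified:", amplified)
print("single molecule:", single_molecule)
```

In OWL terms the same split would come from a defined class with a
preceded_by (or has_part) restriction on amplification, which is what
the classifier would need to pick up.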
I am currently adding the different libraries (paired end ditag library
or single fragment library) to the biomaterial branch and I will need to
add 'library construction' as a planned process.
>
> > I believe that it should be added that these processes achieve
> planned
> > objective: sequence feature identification objective.
> >
> +1. I overlooked this. will add.
>
> Bjoern pointed out that a range of information might also require
> attention in the realm of genome assembly in order to have the
> capability to indicate 'redundancy and fold coverage or number of
> contigs and so forth.
> These are information that matters to Dawn Field and the people behind
> the Genome Standard Consortium.
>
>
> Can you present this information in the form of concrete case-studies
> and use-case please.
Thanks for the input.
P
I guess using 'DNA polymerase or DNA polymerase complex' on the
restriction would solve the problem.
> - Agree with Frank that the 'has specified ... ' relations should have been cleaned up and not refer to roles any more. I created a ticket for Larissa.
> - I thought at the workshop we wanted to create the 'has specified participant' relation, which was to be used for reagents / instruments etc. which are not necessarily present at the start of the process, and more importantly aren't the things transformed into outputs.
>
This is still fine for the time being. I will use those relations as
placeholders and carry out the fixes as soon as the more refined
relations are phased in.
cheers
Philippe