Next steps are to post these at the Wiki page for branch core terms
and to start an email discussion on each with regard to status,
issues, and proposals.
Cheers,
Chris
Just to chip in :) I think I agree with what Melanie and Allyson are
suggesting. I believe it makes sense to separate the information and
its encoding, as the same _information_ could be encoded by multiple
encodings. It would also appear to make sense at this stage to think
of narrative object as a defined class as Ally suggests.
To continue Ally's example, you could take it further and define a
particular schema for the report, such as:
MyReport encoded_in (textFormat and has_schema BBSRC_project_report_schema)
Here the BBSRC_project_report_schema can define the text formatting,
such as chapter headings, reference styles, etc. This is just a quick
example, and I have not thought through distinctions such as that
between the schema for the XML specification and an XML_schema for a
particular XML_document, i.e. MAGE, or whether there is a distinction
at all.
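To make the separation of information from its encoding concrete, here is a minimal Python sketch. The class names `InformationEntity` and `Encoding` and the second format `pdfFormat` are assumptions made up for illustration; only `MyReport`, `textFormat`, and `BBSRC_project_report_schema` come from the example above.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Encoding:
    """A format, optionally constrained by a schema (hypothetical model)."""
    format_name: str
    schema: Optional[str] = None


@dataclass
class InformationEntity:
    """The information itself, independent of how it is written down."""
    label: str
    encodings: list = field(default_factory=list)

    def encoded_in(self, encoding: Encoding) -> None:
        # The same information may be encoded more than once.
        self.encodings.append(encoding)


my_report = InformationEntity("MyReport")
my_report.encoded_in(Encoding("textFormat", schema="BBSRC_project_report_schema"))
my_report.encoded_in(Encoding("pdfFormat"))  # second encoding, assumed for illustration

# Collect the names of all formats that encode the same information.
formats = [e.format_name for e in my_report.encodings]
print(formats)
```

The point of the sketch is only that one entity can carry several encodings, each with or without a schema; it says nothing about how the relation should be named in the ontology.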
Frank
--
Frank Gibson, PhD
http://peanutbutter.wordpress.com/
Also, where does software go? It seems like it would fit more with
report than with the others.
Chris
MyReport is an instance of report.
A report is an information content entity that is a statement of the
results of an investigation, or of any matter on which definite
information is required, made by some person or body instructed or
required to do so.
Types of reports include patent content.
A patent content is a report for the purpose of obtaining a license
from a government conferring for a set period the sole right to make,
use, or sell some process or invention.
Reports have parts that are statements about intent and
interpretations of investigations.
Conclusion is a part of a report that is a statement about the outcome
of an investigation.
Then narrative objects could be reports and parts of reports.
Report parts can be encoded in different formats: text format, figure
format, table format
So adjusting the current hierarchy, this would result in:
information content entity
    information entity about a realizable
        data format specification
            report format specification
                text format
    narrative object
        report
            patent content
        part of a report
            conclusion
Cheers,
Chris
>
> More thoughts on these to try and make some progress on these core
> terms.
Thanks Chris :)
>
>
> MyReport is an instance of report.
> A report is an information content entity that is a statement of the
> results of an investigation, or of any matter on which definite
> information is required, made by some person or body instructed or
> required to do so.
> Types of reports include patent content.
> A patent content is a report for the purpose of obtaining a license
> from a government conferring for a set period the sole right to
> make, use, or sell some process or invention.
>
> Reports have parts that are statements about intent and
> interpretations of investigations.
> Conclusion is a part of a report that is a statement about the
> outcome of an investigation.
> Then narrative objects could be reports and parts of reports.
>
I would keep your proposal, removing "part of report", saying just
that conclusion is a narrative object. I don't think we always need to
have a full report when we are making a conclusion?
> Report parts can be encoded in different formats: text format,
> figure format, table format
I would keep text format directly under data format specification, and
suggest not adding report format specification. The text format can
also be used to encode other things, my protocol for example.
So that would give something like:
information content entity
    information entity about a realizable
        data format specification
            text format
    narrative object
        report
            patent content
        conclusion
Actually, the current data formats are instances of data format
specification, so text would probably deserve the same treatment. I
remember there was a discussion about instances vs subclasses, but I'm
afraid I forgot the resolution. Could somebody refresh my memory?
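For anyone else trying to recall the instances-versus-subclasses question, the two options can be pictured with a small Python analogy. This is only a sketch: the class `DataFormatSpecification` and the labels below are stand-ins invented for illustration, and the analogy does not settle which option the ontology should use.

```python
class DataFormatSpecification:
    """Hypothetical stand-in for the 'data format specification' class."""

    def __init__(self, label: str) -> None:
        self.label = label


# Option A: each format is an instance (individual) of the class.
html_as_instance = DataFormatSpecification("html")
text_as_instance = DataFormatSpecification("text")


# Option B: a format is a subclass, so it can have instances of its own.
class TextFormat(DataFormatSpecification):
    pass


some_text_spec = TextFormat("a particular text format specification")

# Both options keep 'text' under 'data format specification', but only
# option B leaves room for further individuals beneath it.
a_is_instance = isinstance(text_as_instance, DataFormatSpecification)
b_is_subclass = issubclass(TextFormat, DataFormatSpecification)
b_member_is_instance = isinstance(some_text_spec, DataFormatSpecification)
print(a_is_instance, b_is_subclass, b_member_is_instance)
```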
Thanks,
Melanie
I really liked this part of Chris's email:
Here are things that seem like each other:
1) report, journal article, patent
2) hypothesis, conclusion, result
3) figure, text, table
4) pdf, html page
I would translate these into categories as follows:
1 = purpose of information for the author and audience
2 = relation of information to reality, according to the author
3 = format of information, as seen by humans
4 = format of information, as seen by computers
With that I would draw the hierarchy as follows (slightly different from
the other proposals)
information content entity
    report (1)
        journal article
        patent application
        grant progress report
        patient record
    proposition (2)
        conclusion
        hypothesis
        result
    information entity about a realizable
        human format specification (3)
            text format
            figure format
            audio format
            table format
        digital format specification (4)
            html spec
            pdf spec
Note that these would be combined. There can be a patent application encoded
according to the pdf spec, which has a part describing a result
formatted as a Figure.
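A rough Python sketch of how the four categories combine in that example. The class and field names (`Report`, `Part`, `report_type`, and so on) are assumptions chosen for illustration, not proposed ontology terms; the values are taken from the hierarchy above.

```python
from dataclasses import dataclass, field


@dataclass
class Part:
    proposition_type: str   # category 2, e.g. "result" or "hypothesis"
    human_format: str       # category 3, e.g. "figure format"


@dataclass
class Report:
    report_type: str        # category 1, e.g. "patent application"
    digital_format: str     # category 4, e.g. "pdf spec"
    parts: list = field(default_factory=list)


# A patent application encoded according to the pdf spec, with a part
# describing a result formatted as a figure.
application = Report(
    report_type="patent application",
    digital_format="pdf spec",
    parts=[Part(proposition_type="result", human_format="figure format")],
)

# Select the parts that are results presented as figures.
figure_results = [
    p for p in application.parts
    if p.proposition_type == "result" and p.human_format == "figure format"
]
print(len(figure_results))
```

Each of the four categories ends up as an independent field, which is the sense in which they are combined rather than nested.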
- Bjoern
--
Bjoern Peters
Assistant Member
La Jolla Institute for Allergy and Immunology
9420 Athena Circle
La Jolla, CA 92037, USA
Tel: 858/752-6914
Fax: 858/752-6987
http://www.liai.org/pages/faculty-peters
>
> I like this so far. Here is my input, which definitely can be
> further improved.
>
> I really liked this part of Chris's email:
>
> Here are things that seem like each other:
> 1) report, journal article, patent
> 2) hypothesis, conclusion, result
> 3) figure, text, table
> 4) pdf, html page
>
> I would translate these into categories as follows:
>
> 1 = purpose of information for the author and audience
> 2 = relation of information to reality, according to the author
> 3 = format of information, as seen by humans
> 4 = format of information, as seen by computers
>
I like the report (1) and proposition (2) part.
>
> With that I would draw the hierarchy as follows (slightly different
> from the other proposals)
>
> information content entity
>     report (1)
>         journal article
>         patent application
>         grant progress report
>         patient record
>     proposition (2)
>         conclusion
>         hypothesis
>         result
I think we all agree on the above proposal? Maybe we should start
working on their definitions, starting with report and proposition?
In order to get us started (and knowing they probably need more
work ;) ):
1. report (would replace current narrative object)
A report is an ICE describing findings of some individual or group. It
usually is made up of a set of propositions, and can be written,
spoken or encoded digitally. Its content is reflective of an
investigation, and its purpose is to inform an audience.
(from http://www.google.ca/search?q=define%3Areport, Bjoern, and
current definition of narrative object)
2. proposition
A proposition is an ICE that affirms or denies something and is either
true or false. A proposition represents the relation of the
information to reality according to the author.
alternative term: statement
(from http://www.google.ca/search?q=define%3Aproposition and Bjoern)
(side note: those 2 terms would probably go to IAO, and OBI would deal
with specific subclasses.)
>     information entity about a realizable
>         human format specification (3)
>             text format
>             figure format
>             audio format
>             table format
>         digital format specification (4)
>             html spec
>             pdf spec
>
> Note that these would be combined. There can be a patent application
> encoded according to the pdf spec, which has a part describing a
> result formatted as a Figure.
I'm not sure we need to split 3 and 4.
The advantage of splitting them is that if we were to have the two
relations encoded_according_to and formatted_as (labels to be
discussed, of course), we could have clear ranges: report/proposition
encoded_according_to digital format specification, and
report/proposition formatted_as human format specification. But I am
not sure we need both relations.
I don't have a strong opinion about that and I don't think it should
hold us back.
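To show what "clear ranges" would buy us, here is a small Python sketch. The lookup tables and the helper `in_range` are made up for illustration; the relation labels encoded_according_to and formatted_as are the provisional ones mentioned above, and the format names come from Bjoern's hierarchy.

```python
# Parent links for the format branch of the proposed hierarchy.
PARENT = {
    "text format": "human format specification",
    "figure format": "human format specification",
    "audio format": "human format specification",
    "table format": "human format specification",
    "html spec": "digital format specification",
    "pdf spec": "digital format specification",
}

# The range each relation would have if 3 and 4 are kept separate.
RANGE = {
    "encoded_according_to": "digital format specification",
    "formatted_as": "human format specification",
}


def in_range(relation: str, target: str) -> bool:
    """Return True if `target` falls under the range declared for `relation`."""
    return PARENT.get(target) == RANGE[relation]


checks = {
    ("encoded_according_to", "pdf spec"): in_range("encoded_according_to", "pdf spec"),
    ("formatted_as", "figure format"): in_range("formatted_as", "figure format"),
    ("formatted_as", "pdf spec"): in_range("formatted_as", "pdf spec"),
}
print(checks)
```

With a single merged format class and a single relation, the third check could no longer be flagged as a mismatch, which is the trade-off in question.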
Another thing that I think is related, and that we may want to address
at the same time:
I am also wondering about representing the structure of files: we had
a discussion with Richard about how to say what the columns in a file
would represent (similar to what Pierre asked at http://groups.google.com/group/information-ontology/browse_thread/thread/ffcc3981960dc04b).
Any suggestions on how we would deal with "column", for example? Would
we say a table format has columns but a text format does not? I guess
it depends on the definition we choose for text format (I assume we
don't mean .txt files here, but "unstructured written document").
Thanks,
Melanie
I'm fine with your definitions for report and proposition.
I do favor splitting the types of format specifications as Bjoern
suggests.
Thanks,
Chris
Frank
The html source code may be text, but an html document viewed by a
browser is not (e.g. only setting background color = pink isn't much of
a read).
- Bjoern
Frank
If you are talking about a figure in publishing terms, then:
A floating block, also called a figure, in writing and publishing is
any graphic, text, table or other representation that is unaligned
from the main flow of text. Use of floating blocks to present pictures
and tables is a typical feature of academic writing, including
scientific articles and books. Floating blocks are normally
labeled with a caption or title that describes their contents and a
number that is used to refer to the figure from the main text. A
common system divides floating blocks into two separately numbered
series, labeled figure (for pictures, diagrams, plots, etc.) and
table.
Frank
I see the distinction - for my part I was confused by the "as seen by
human" and "as seen by computer".
I think "as seen by computer" is what we are currently calling data
format specification, i.e. the file format, encoding.
The "as seen by human" is equivalent to "a visual representation (of
an object or scene or person or abstraction) produced on a surface"
and may possibly go with other graph terms (like diagram etc)
Currently in IAO (see http://code.google.com/p/information-artifact-ontology/issues/detail?id=8&can=1):
-report figure
--- report graph
------ Venn diagram
------ survival curve
------ heatmap
------ histogram
------ dendrogram
------ scatterplot
------ dot plot
------ contour plot
------ density plot
report figure: A report figure is a report display element that has
some aspect of illustration, but may be a composite of figures,
images, and other elements
report graph: A report graph is a report figure that presents one or
more tuples of information by mapping those tuples into a
two-dimensional space in a non-arbitrary way.
Result would be: (with the numbers as initially assigned by Bjoern)
- information content entity
---- report (1)
------------ journal article
------------ patent application
------------ grant progress report
------------ patient record
---- proposition (2)
------------ conclusion
------------ hypothesis
------------ result
---- information entity about a realizable
------------ report figure (3)
------------------ report graph
---------------------- dot plot
---------------------- [..]
------------------ text format
------------------ figure format (see the editor note below)
------------------ audio format
------------------ table format
------------ data format specification (4)
------------------ html spec
------------------ pdf spec
(there is an editor note to report figure: "I prepended the 'report '
to make it clear that we mean parts of reports here. We may want a more
generic version of 'figure', in which case this would become a defined
class - figure and part_of some report")
(note: html, pdf, xml are instances of data format specification - I
still don't remember why we decided instances and not subclasses :) )
How does that look?
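In case it helps when comparing proposals, a dash-prefixed outline like the one above can be read mechanically. The following Python sketch is only an illustration: the function `parse_outline` is a made-up helper, and it assumes that depth is the number of leading dashes and that a line's parent is the nearest preceding line with fewer dashes.

```python
def parse_outline(lines):
    """Turn dash-prefixed lines into (parent, child) pairs."""
    stack = []   # (depth, label) for the current chain of ancestors
    edges = []
    for raw in lines:
        stripped = raw.lstrip("-")
        depth = len(raw) - len(stripped)   # number of leading dashes
        label = stripped.strip()
        # Drop ancestors that are at the same depth or deeper.
        while stack and stack[-1][0] >= depth:
            stack.pop()
        parent = stack[-1][1] if stack else None
        edges.append((parent, label))
        stack.append((depth, label))
    return edges


outline = [
    "- information content entity",
    "---- report (1)",
    "------------ journal article",
    "---- proposition (2)",
    "------------ conclusion",
]
edges = parse_outline(outline)
for parent, child in edges:
    print(parent, "->", child)
```

Because only relative dash counts matter, the uneven step sizes in the outline (1, 4, 12, 18 dashes) do not cause problems.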
Melanie
Frank
>>>>>>>>>>>>>>>> "encoded_in/represented_by".
Melanie: Yes, the IAO terms clearly belong there. How about 'report
element format' instead of 'report figure' as the label for the upper
class? Audio format at least does not fit well into 'report figure'.
For the same reason, it cannot be limited to visual and 'on a
surface'. We can say that a 'report element format' is a format in
which information is presented to and consumed by a human being.
report element format
--- figure format
------ Venn diagram
------ survival curve (? too content specific? Could be x-y line graph)
------ heatmap
------ histogram
------ dendrogram
------ scatterplot
------ dot plot
------ contour plot
------ density plot
--- table format
--- text format
--- audio format
--- movie format
Needs some more work (keep 'format' or not, clarify relations).
- Bjoern.
Do not take the "consumed by human" approach - only deal with the
specification of what it is, not what it's used for.
Frank