These are:
Digital Quality
Digital Entity
Binary Digital Entity
Binary Executable
Digital Document
Text Based Digital Document
There is discussion of having a relation is_encoded_as and a number of
classes representing encodings (synonym "data structures") such as
lists, arrays, binary trees, etc.
Distinctions worth making, in this vicinity:
The sense of encoding (or data structure) here is that it determines
how the information entities it structures are concretized as
specifically dependent continuants, in such a way that a mechanism
which manipulates or accesses those specifically dependent continuants
can support a certain set of operations (each with a given time
complexity).
(A nagging feeling here - there are layers of encodings, typically. On
a hard disk one has the logical blocks, for which the operation "next"
is quick, but this can be further encoded as a mapping to several
disks, in which the operation "next" is sometimes quick, and sometimes
possibly not.)
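To make the "set of operations (with given time complexity)" point
concrete, here is a minimal Python sketch. The Node class, the helpers
to_linked and nth, and the values are all made up for illustration. It
only shows that when content is encoded as a linked list, "next" is
cheap while access by position is linear.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    """One cell of a singly linked list (hypothetical, for illustration)."""
    value: float
    next: Optional["Node"] = None


def to_linked(values):
    """Encode a sequence of values as a singly linked list."""
    head = None
    for v in reversed(values):
        head = Node(v, head)
    return head


def nth(head, n):
    """Follow n links from the head: O(n), unlike O(1) array indexing."""
    node = head
    for _ in range(n):
        node = node.next
    return node.value


values = [4.2, 4.5, 4.1]   # invented content
head = to_linked(values)

# "next" is a single pointer hop in the linked encoding.
second = head.next.value

# Positional access gives the same answer in both encodings,
# but costs a walk in the linked one.
third_linked = nth(head, 2)
third_array = values[2]
```

The information content is the same in both cases; what differs is
which operations the encoding makes quick.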
As discussed in the OBI meeting, this is different from structures
used in simulations/models such as network models. The distinction is
that in those cases there is a mapping of such operations directly
into the domain that is being modeled - for instance in a network
model of protein-protein interaction, "nodes" represent proteins and
"edges" represent processes, such as binding, and "functional
proximity" is computed by some function on network proximity. The
sense of encoding above carries no suggestion that analogies can be
drawn *from the structure to the entities that the information content
it structures is about*.
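As a rough sketch of the network-model case, assuming a toy
interaction graph: the protein names P1..P3, the dictionary binds, and
the function network_distance are hypothetical, and shortest-path
length is just one possible choice of "network proximity".

```python
from collections import deque

# Hypothetical interaction network: nodes stand for proteins,
# edges stand for binding processes between them.
binds = {
    "P1": {"P2"},
    "P2": {"P1", "P3"},
    "P3": {"P2"},
}


def network_distance(graph, start, goal):
    """Breadth-first search; returns the edge count of a shortest path,
    or None if goal is unreachable from start."""
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, dist + 1))
    return None


# Here the operation on the structure is read as a claim about the
# domain: a shorter path is taken to mean closer functional proximity.
d = network_distance(binds, "P1", "P3")
```

In this case the graph operations are mapped into the domain being
modeled, which is exactly what the encoding sense does not imply.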
Moreover I think some care needs to be taken to make sure that we also
don't confuse them with composite data items. For instance, we may
have a number of measurements that were taken at different times. The
information content entities that are these measurements may or may
not be encoded as a list, or an array, or any of a number of other
data structures.
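A small Python sketch of that last point, with invented (time, value)
pairs. The Python set stands in for the composite only for the sake of
the example (it is of course itself a data structure); the list and
the mapping are two alternative encodings of the same measurements.

```python
# Hypothetical measurements: (time in seconds, value).
measurements = {(0, 4.2), (60, 4.5), (120, 4.1)}

# One encoding: a list ordered by time.
as_list = sorted(measurements)

# Another encoding: a mapping from time to value.
as_mapping = dict(measurements)

# Either encoding recovers the same measurements.
same_from_list = set(as_list) == measurements
same_from_mapping = set(as_mapping.items()) == measurements
```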
Your thoughts on both the prospect of deprecating the above terms and
the ideas about "encodings" are solicited.
Thanks,
Alan
One way would be to reincorporate them into OBI.
> OBI currently has a class "data representational model" (Work in
> progress. Currently: alt term data structure, data structure
> specification, definition: "Data representational model is an
> information content entity of the relationships between data items. A
> data representational model is encoded in a data format specification
> such as for cytoscape or biopax.") that we created for the purpose of
> representing the structure data can have. Maybe we should discuss with
> the DENRIE branch on how best to coordinate efforts?
This is clearly in the scope of IAO. I'm leery of further process
discussion. We've got enormous overlap between OBI and the IAO. OBI
folks should buck up and get on with business.
That said, you are of course welcome to initiate anything you want to discuss.
We are thinking about data structure for IAO. I'll bug Jonathan again.
I sent out a note about this on the IAO list recently. The thing to be
careful about is the difference between models and structure.
FWIW, I don't think the definition above works well.
I have been meaning to add a QC test that checks for use of obsolete
terms. It will help with cleanup of properties as well.
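A minimal sketch of what such a check might look like, assuming the
axioms have already been reduced to tuples of the term identifiers
they mention. The function name find_obsolete_uses and the EX_
identifiers are hypothetical; a real test would read them from the
ontology files.

```python
def find_obsolete_uses(axioms, obsolete):
    """Return (axiom index, term) pairs where an obsolete term is used."""
    return [
        (i, term)
        for i, terms in enumerate(axioms)
        for term in terms
        if term in obsolete
    ]


# Hypothetical identifiers, for illustration only.
obsolete = {"EX_0000002"}
axioms = [
    ("EX_0000001", "EX_0000003"),
    ("EX_0000004", "EX_0000002"),
]

problems = find_obsolete_uses(axioms, obsolete)
```

An empty result would mean the check passes.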
> To add to the list of "encodings", how do we deal with character
> encoding? (utf-8, latin-1...)
I don't know yet. But following the idea of naming the specifications
(as instances), we can name the specification that defines the
character sets.
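For what it's worth, the character encoding case is easy to show in
Python: the same text is concretized as different byte sequences
depending on which specification is followed. The variable names here
are just for illustration.

```python
text = "\u00e9"  # LATIN SMALL LETTER E WITH ACUTE

utf8_bytes = text.encode("utf-8")      # two bytes: C3 A9
latin1_bytes = text.encode("latin-1")  # one byte: E9

# Different concretizations...
different_bytes = utf8_bytes != latin1_bytes

# ...of the same text, provided each is read back with its own encoding.
round_trip = (
    utf8_bytes.decode("utf-8") == latin1_bytes.decode("latin-1") == text
)
```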
-Alan