New issue report by mcourtot:
Please indicate the label for the new term
integer
Please provide a textual definition
An element of the infinite and numerable set {...,-3,-2,-1,0,1,2,3,...}.
Please add an example of usage for that term
Please provide any additional information below. (e.g., proposed position
in the IAO hierarchy)
synonym: whole number, when understood to include negative numbers and zero.
from http://en.wiktionary.org/wiki/integer
Issue attributes:
Status: New
Owner: mcourtot
Labels: Type-Term Priority-Medium
--
You received this message because you are listed in the owner
or CC fields of this issue, or because you starred this issue.
You may adjust your issue notification preferences at:
http://code.google.com/hosting/settings
Comment #1 by d...@georgetown.edu:
I find this definition rather unsatisfactory. Couldn't 'integer' be
defined as a number that lacks a remainder when divided by 1?
Another issue pertaining to this term is one of representation. For
example, 16, 2*8 (two multiplied by eight), 16.0, and 2^4 (two to the
fourth power) are all representations of the same number, which happens
to be an integer. For the use case in mind, are they all considered
integers, or just the string '16'?
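The two points in this comment can be made concrete with a minimal Python sketch. The helper name lacks_remainder and the dictionary below are hypothetical, and Python numbers only stand in for the mathematical ones. The sketch checks the proposed "no remainder when divided by 1" criterion and shows that the four expressions yield the same number, while only one of them is the digit string '16'.

```python
from fractions import Fraction

def lacks_remainder(x):
    """Proposed criterion: dividing by 1 leaves no remainder."""
    return x % 1 == 0

# The four representations from the comment, keyed by how they are written.
# '2^4' is evaluated as 2 ** 4 because ^ is bitwise XOR in Python.
representations = {
    "16": 16,
    "2*8": 2 * 8,
    "16.0": 16.0,
    "2^4": 2 ** 4,
}

# All four expressions evaluate to the same number...
same_number = all(value == 16 for value in representations.values())
# ...and that number satisfies the proposed criterion...
all_integers = all(lacks_remainder(value) for value in representations.values())
# ...but only one of the written forms is a plain string of digits.
only_digit_strings = [text for text in representations if text.isdigit()]

print(same_number, all_integers, only_digit_strings)
print(lacks_remainder(Fraction(7, 2)))  # a non-integer for contrast
```

Whether the use case wants the number or the written form is exactly the question the comment raises; the sketch only shows that the two can come apart.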
Tim Clark
Director of Informatics, MassGeneral Institute for Neurodegenerative
Disease
Neurology Research Department, Massachusetts General Hospital
Instructor in Neurology, Harvard Medical School
617-947-7098 (mobile)
>Hi Barry, Just to clarify, are you suggesting that mathematical
>knowledge is a priori?
No. This is an issue of ontology not of epistemology. Perhaps it will
be clearer this way round: datatypes are types of data. Data are the
sorts of thing that can e.g. be stored in a computer. An integer is
not a type of data in this sense. Some data are representations of
integers. But not every datum that is a representation of an integer
is of datatype: integer. E.g. 'two', 'six divided by three', and
'Tim's favorite integer' are representations of an integer, but
they are (presumably) not of datatype: integer.
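A rough sketch of this distinction, assuming purely for illustration that data of datatype: integer are recognised by a lexical form like the one XML Schema uses for its integer type (an optional sign followed by decimal digits). The thread itself does not fix any such rule, and the function name is hypothetical.

```python
import re

# Assumed lexical form: optional sign followed by one or more decimal digits.
INTEGER_LEXICAL_FORM = re.compile(r"[+-]?[0-9]+")

def is_of_datatype_integer(datum: str) -> bool:
    """True if the datum has the assumed lexical form of datatype: integer."""
    return INTEGER_LEXICAL_FORM.fullmatch(datum) is not None

# Every candidate below represents an integer, but not every one
# has the lexical form assumed for the datatype.
candidates = ["2", "-17", "two", "six divided by three", "Tim's favorite integer"]
results = {datum: is_of_datatype_integer(datum) for datum in candidates}

for datum, accepted in results.items():
    print(repr(datum), accepted)
```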
BS
In general aren't there a lot of cases where computing technology
borrows a word and makes it a "term of art", including the term
"ontology" itself?
At 01:14 PM 11/6/2008, Alan Ruttenberg wrote:
>It might be worth having a look at the definitions of datatypes in
>the OWL 2 syntax draft to see whether we can piggy back off those definitions.
>
>http://www.w3.org/2007/OWL/wiki/Syntax#Datatype_Maps
>
I really doubt that this
>
>5.2 Datatypes
>
>
>
>Datatypes are entities that refer to sets of built-in values. Thus,
>datatypes are analogous to classes, the main difference being that
>the former contain literals (such as strings and numbers) rather
>than individuals. Datatypes are a kind of data ranges, which allows
>them to be used in restrictions. All datatypes have arity one. An
>ontology containing a datatype with a URI that is neither
>rdfs:Literal nor it belongs to the datatype map (defined in Section
>4) is syntactically invalid. The built-in datatype rdfs:Literal
>denotes any set that contains the union of the value spaces of all
>datatypes in the datatype map.
is a good starting point. (a) entities that refer are instances, for
instance instances of names; datatypes, surely, are not instances, and
they are things that get referred to; (b) 'datatypes contain literals'
where 'classes contain individuals'; this is a step in the right
direction, I suppose, but then what are literals?:
>Literals represent values such as particular strings or integers.
>They are analogous to literals in RDF [RDF Syntax] and can also be
>understood as individuals denoting built-in values.
This supports my thesis that things of datatype:integer are not
integers, just as things of datatype:string are not strings. It would
be nice to know (c) what 'represent' means; but also what sorts of
entities these things are which do the representing. (Particularly
since we have been told both (d) that there is a 'main difference'
between literals and individuals and (e) that literals 'can also be
understood as individuals' (sigh).)
BS
>-Alan
>
>On Thu, Nov 6, 2008 at 12:09 PM, <codesite...@google.com> wrote:
>
>Issue 4: data types - integer
>http://code.google.com/p/information-artifact-ontology/issues/detail?id=4
>
>I see. I was confused by the statement "...datatypes are human
>creations...integers however are not."
>
>In general aren't there a lot of cases where computing technology
>borrows a word and makes it a "term of art", including the term
>"ontology" itself?
Yes. But computer people are not always aware that a difference in
meaning is thereby created. (Thus for instance 'disease' in HL7
means: 'observation of disease').
BS
James
>The datatype integer (or any other)
'integer' is the name for the object, not for the symbol; we translate symbols
>are merely translations of the
>integer into some machine interpretable syntax, upon which operations
>can be performed as per the definition of the specification for that
>datatype. The integers Barry speaks of are the original 'human
>integers',
When I refer to integers, I mean entities which are outside the realm
of causality and independent of any use of corresponding symbols. I
don't think it makes sense to talk of 'human integers'. It does,
though, make sense to talk of human-created systems of symbols for
referring to integers (not sure which was the original one of these).
>the figure 12 for instance,
you mean the symbol '12' I take it?
(It is very important when dealing with information artifacts to get
into the habit of distinguishing entities from the expressions
referring to them by using quote marks for the latter. Thus Oxford is
the town. 'Oxford' is the name of the town.)
>whereas the datatype integer is
>some translation of that into some machine readable syntax.
I think you mean:
each instance of the datatype integer is some translation of some
human-produced integer symbol into some machine readable syntax.
> There is
>a relationship between the two which is a translation from one into
>the other (and back again). To satisfy, you might well require
>integer and a datatype integer which is a translation of that integer
>(as per my definition above) and so on for the rest.
There seems to be a potential ambiguity in your use of 'datatype
integer' here, as between referring to the datatype, and referring to
an instance of the datatype.
BS
To aid my understanding, could you give me an example of an integer
that is outside the realm of causality and is independent of any use
of corresponding symbols? Surely the scope of IO is to capture the
way we represent information about something and not things we can't
represent without these symbols?
> I think you mean:
> each instance of the datatype integer is some translation of some
> human-produced integer symbol into some machine readable syntax.
Yes, the translation specification is captured by the class datatype
integer. That is the whole point of the class datatype integer,
presumably.
Thanks,
JM
> > When I refer to integers, I mean entities which are outside the realm
> > of causality and independent of any use of corresponding symbols. I
> > don't think it makes sense to talk of 'human integers'. It does,
> > though, make sense to talk of human-created systems of symbols for
> > referring to integers (not sure which was the original one of these).
>
>To aid my understanding, could you give me an example of an integer
>that is outside the realm of causality and is independent of any use
>of corresponding symbols?
3
> Surely the scope of IO is to capture the
>way we represent information about something and not things we can't
>represent without these symbols?
exactly; IAO (information artifact ontology) is about, for instance,
'3'; it is not about 3.
> > I think you mean:
> > each instance of the datatype integer is some translation of some
> > human-produced integer symbol into some machine readable syntax.
>
>Yes, the translation specification is captured by the class datatype
>integer. That is the whole point of the class datatype integer,
>presumably.
How can a class capture a specification?
And how far does the specification reach? Does it specify, e.g., how
we are to translate Ukrainian natural-language designations of integers?
BS
Agreed. And I presume when we say integer we are talking about '3'.
When I talk about datatype integer I am talking about '3' translated
into the machine readable form I have previously described.
> How can a class capture a specification?
For instance restrictions on the range of values it can take.
Specifications currently exist in IAO so I am not asking for anything
new.
> And how far does the specification reach? Does it specify, e.g., how
> we are to translate Ukrainian natural-language designations of integers?
It would represent the things we need it to represent that would allow
us to understand that this is an interpretation of an integer into
some machine form. Does the scope of IAO reach into your example? No
idea, although this is a perfectly valid use case of information.
So going back to the original definition proposed:
a datatype is a specification which defines a set of data values and
allowable operations on those values
I don't think datatype is_a type of data, it's a child of a data
specification. It is more than just a symbol, it is a symbol
associated with for example, a range of allowed values and operations.
"A datatype is_a (child of) data specification which defines a set of
data values and allowable operations on those values." I like this.
Cheers,
James
> > exactly; IAO (information artifact ontology) is about, for instance,
> > '3'; it is not about 3.
>
>Agreed. And I presume when we say integer we are talking about '3'.
No. When we say 'integer' [please note the careful use of quote
marks] we are referring to those things which can be summed, divided,
multiplied, decomposed into prime factors. When we refer to
'3'
we are talking about something that can be made of ink, projected on
a screen, written on a piece of paper, ...
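As a loose illustration of this contrast, here is a Python sketch in which an int stands in for the number and a str stands in for the numeral. This is an assumption made for the example only: strictly, a Python int is itself a machine representation, not the abstract integer. The value 12 is borrowed from the "figure 12" mentioned earlier in the thread, and prime_factors is a hypothetical helper.

```python
from typing import List

def prime_factors(n: int) -> List[int]:
    """Decompose a positive integer into prime factors by trial division."""
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:
        factors.append(n)
    return factors

number = 12      # stands in for the integer: can be summed, divided, factored
numeral = "12"   # stands in for the symbol: has characters, a length, an encoding

# Arithmetic applies to the number.
total = number + 3
factors = prime_factors(number)

# Questions about marks on a page or bytes in a file apply to the numeral.
glyph_count = len(numeral)
first_glyph = numeral[0]
encoded = numeral.encode("ascii")

print(total, factors)
print(glyph_count, first_glyph, encoded)
```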
>When I talk about datatype integer I am talking about '3' translated
>into the machine readable form I have previously described.
Would this not be a datum that is an instance of the datatype:integer?
> > How can a class capture a specification?
>
>For instance restrictions on the range of values it can take.
Specifications are one thing; restrictions are another thing; as is
seen in the fact that we can have specifications of restrictions.
>Specifications currently exist in IAO so I am not asking for anything
>new.
>
> > And how far does the specification reach? Does it specify, e.g., how
> > we are to translate Ukrainian natural-language designations of integers?
>
>It would represent the things we need it to represent that would allow
>us to understand that this is an interpretation of an integer into
>some machine form. Does the scope of IAO reach into your example? No
>idea, although this is a perfectly valid use case of information.
The scope of IAO is all information artifacts. Currently (as I
understand it) we are trying to specify what are the information
artifacts which are instances of datatype:integer (members of this
class, if you like). This specification should, surely, be quite
simple and quite stable; it should not change because we encounter
one day a need to translate data from Ukrainian free text.
>So going back to the original definition proposed:
>
>a datatype is a specification which defines a set of data values and
>allowable operations on those values
>
>I don't think datatype is_a type of data, it's a child of a data
>specification. It is more than just a symbol, it is a symbol
>associated with for example, a range of allowed values and operations.
But it is (on your view) a symbol. Seems wrong for me to assert, say, that
'the symbol "3" is a datatype'
But still, I will go along with you for the moment. You say
a datatype is a child of a data specification
This seems to be a problem, however, since parent-child relations
hold between types (classes ...) and you insist that datatypes are
not types. Can you clarify?
>"A datatype is_a (child of) data specification which defines a set of
>data values and allowable operations on those values." I like this.
So a specification of a datatype is a specification of a
specification? When people end up with results like this, this is
usually a sign that they have gone wrong in their thinking.
BS
I don't think this follows normal usage. 3 is an integer, but '3' is
usually said to be a numeral - a numeral that denotes an integer.
This suggests a hierarchy:
datum -- well, this will be very tricky to define, but maybe some
information-like stuff that might be put into a computer and that is
meant, by someone, to denote and/or to be interpreted by some
process... I would include lists, tables, sentences... I think I might
defer to Barry, or to Brian Cantwell Smith
  symbol (subclass of datum) -- a smallish, word-like datum...
  again I'm not sure
    numeral (subclass of symbol) -- a symbol that denotes a number
      integer numeral (subclass of numeral) -- a numeral that
      denotes an integer (could also be called 'integer datum')
Then a datatype would be any subclass of datum, and a datatype
specification would be an information artifact that specifies a
datatype.
Mathematical types, e.g. integer, as Barry says, would be a different story.
Now what's not clear to me is where interpretation happens. We would
usually say that '3' and 'III' are different numerals even though they
denote the same integer. Is the problem that we have a need to talk
about the quality of being a numeral that denotes the integer 3? E.g.
  '3' has the-quality-of-denoting-3
  'III' has the-quality-of-denoting-3
in the same way that you'd say that some piece of paper has
the-quality-of-being-blue (I'm not sure I'm using 'quality' correctly
but I hope that doesn't matter). Such a quality (generically dependent
continuant?) would be neither a number nor a numeral. I have no idea
what you'd call such a thing, or what you'd call a class of such
things. If we want to call such a class a "datatype" that's OK but
then we have to come up with a different name for what I called
"datatype" above.
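The '3' versus 'III' point above can be sketched in a few lines of Python. The function name denotes is a hypothetical stand-in for "has the-quality-of-denoting-N", and the Roman-numeral reader is a simplified one that does not validate its input. The sketch shows two distinct numerals that map to one and the same value.

```python
ROMAN_VALUES = {"I": 1, "V": 5, "X": 10, "L": 50, "C": 100, "D": 500, "M": 1000}

def roman_to_int(numeral: str) -> int:
    """Read a Roman numeral; a smaller symbol before a larger one is subtracted."""
    total = 0
    # Pair each symbol with its successor; a space marks the end.
    for symbol, next_symbol in zip(numeral, numeral[1:] + " "):
        value = ROMAN_VALUES[symbol]
        if next_symbol in ROMAN_VALUES and ROMAN_VALUES[next_symbol] > value:
            total -= value
        else:
            total += value
    return total

def denotes(numeral: str) -> int:
    """Hypothetical 'denotes' relation from a numeral to the integer it names."""
    if numeral.isascii() and numeral.isdigit():
        return int(numeral)
    return roman_to_int(numeral)

arabic, roman = "3", "III"
different_numerals = arabic != roman
same_denotation = denotes(arabic) == denotes(roman)

print(different_numerals, same_denotation)
```

On this reading the shared thing is the value returned by denotes, which is neither of the two strings.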
>On Fri, Nov 7, 2008 at 10:26 AM, James Malone <james....@gmail.com> wrote:
> >
> >> exactly; IAO (information artifact ontology) is about, for instance,
> >> '3'; it is not about 3.
> >
> > Agreed. And I presume when we say integer we are talking about '3'.
>
>I don't think this follows normal usage. 3 is an integer, but '3' is
>usually said to be
>a numeral - a numeral that denotes an integer.
>
>This suggests a hierarchy:
>
>datum -- well, this will be very tricky to define, but maybe some
>information-like stuff that might be put into a computer and that is
>meant, by someone, to denote and/or to be interpreted by some
>process... I would include lists, tables, sentences... I think I might
>defer to Barry, or to Brian Cantwell Smith
> symbol (subclass of datum) -- a smallish, word-like datum...
>again I'm not sure
> numeral (subclass of symbol) -- a symbol that denotes a number
> integer numeral (subclass of numeral) -- a numeral that
>denotes an integer (could also be called 'integer datum')
>
>Then a datatype would be any subclass of datum, and a datatype
>specification would be an information artifact that specifies a
>datatype.
I like nearly all of the above, but I would say that a datatype is
not just ANY subclass of datum (otherwise 'either the numeral "3" or
the name "Bill Clinton"' would designate a datatype). Rather, a
datatype is a subclass of datum all of whose instances are of the
same type. This is not proposed as a definition, of course, since
there is something nigglingly circular about it; but it seems to me
to be intuitively right, nonetheless.
>Mathematical types, e.g. integer, as Barry says, would be a different story.
>
>Now what's not clear to me is where interpretation happens. We would
>usually say that '3' and 'III' are different numerals even though they
>denote the same integer. Is the problem that we have a need to talk
>about the quality of being a numeral that denotes the integer 3? E.g.
>
> '3' has the-quality-of-denoting-3
> 'III' has the-quality-of-denoting-3
The IAO will contain an instance-level relation most likely called
'is_about', for this.
Thus: '3' is_about 3, 'Oxford' is_about Oxford, etc.
Barry
Ok just to expand then,

datum
-(child) datatype
 -(child) datatype integer
And "32-bit integer" is that an instance of a "datatype integer"? So
into "datatype specification" I presume is the rules about allowed
values and operations? Or is "32-bit integer" an instance of
"datatype integer specification" and the corresponding instance of
datatype integer is "1000 0000 0000 0000..."?
It's getting there I think :)
James
>Ok just to expand then,
>
>datum
>-(child) datatype
> -(child) datatype integer
To say that a datatype is a child (is_a) of datum is not to say that
datatype is a child of datum.
To say that an animal species is a child of animal is not to say that
animal species is a child of animal.
An animal species is a child of animal is illustrated for instance by:
mus musculus is_a animal
i.e. every instance of mus musculus is an instance of animal
which is correct.
Animal species is a child of animal would mean, every instance of
animal species is an instance of animal, thus for example the species
mus musculus, which is an instance of animal species, would have to
be an instance of animal, which is incorrect.
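The difference between is_a and instance_of in the animal example can be mimicked, loosely, with Python classes and a metaclass. This is only an analogy under the assumption that Python subclassing stands in for is_a and isinstance for instance_of; the class names are made up for the sketch.

```python
class AnimalSpecies(type):
    """Metaclass: its instances are species, which are themselves classes."""

class Animal:
    """The type animal; its instances are individual animals."""

class MusMusculus(Animal, metaclass=AnimalSpecies):
    """A species: a subclass of Animal and an instance of AnimalSpecies."""

mouse = MusMusculus()

# mus musculus is_a animal: every instance of the species is an animal.
species_is_a_animal = issubclass(MusMusculus, Animal)
mouse_is_animal = isinstance(mouse, Animal)

# mus musculus instance_of animal species.
species_is_instance_of_species_type = isinstance(MusMusculus, AnimalSpecies)

# The species itself is not an animal, so 'animal species is_a animal' fails.
species_is_instance_of_animal = isinstance(MusMusculus, Animal)
species_type_is_a_animal = issubclass(AnimalSpecies, Animal)

print(species_is_a_animal, mouse_is_animal, species_is_instance_of_species_type)
print(species_is_instance_of_animal, species_type_is_a_animal)
```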
>And "32-bit integer" is that an instance of a "datatype integer"?
Are you really sure you are using the quote marks correctly here?
BS
Every instance of datatype integer is_a instance of datum by my text,
that I believe is correct surely?
Is 1001 1001 is_a datum? If so then surely my hierarchy above holds.
>>And "32-bit integer" is that an instance of a "datatype integer specification"?
>
> Are you really sure you are using the quote marks correctly here?
32-bit integer as an instance of the class datatype integer
specification (for clarity I was using quotes to pull out the
important parts of the sentence).
JM
When you say A "child of" B, which I interpret to mean A "subclass of"
B, then you're saying every A is a B. So the above says that every
datatype is a datum, which I do not believe to be sensible.
In my account, datum *is* a datatype because datatypes are classes.
It sounds as if you want datatypes to be logical individuals. There
are two ways to do this: go to a higher-order system where all classes
are individuals (not recommended), or give a different name to the
things that need to be classes and not individuals.
We could say that the classes are "data classes", and each has an
induced "datatype" object (or vice versa) - the class and the datatype
are "isoontic". "Datatypes" are not specifications, since you can have
two distinct specifications (different words, language, logic, etc.)
for the same datatype.
Then we would have:
datum = the class (of information artifacts?) whose members are datums (data)
symbol = the subclass of datum whose members are symbols
numeral = ....
integer numeral = ...
datatype = the class (of ????) whose members are datatypes
integer datatype = the datatype to which integer numerals belong ??
It's not obvious to me that we need to talk about datatypes as individuals, but
I'm sure that's due to ignorance.
> And "32-bit integer" is that an instance of a "datatype integer"?
The datatype 32-bit integer numeral would be an instance of the class
of datatypes.
Any particular 32-bit integer numeral would "have" that datatype, and
would be a member of the class of things that have that datatype (the
32-bit integer numeral class, as opposed to the 32-bit integer numeral
datatype).
> into "datatype specification" I presume is the rules about allowed
> values and operations? Or is "32-bit integer" an instance of
> "datatype integer specification" and the corresponding instance of
> datatype integer is "1000 0000 0000 0000..."?
OK, we can shape this however we like, but can't really go much
farther without use cases, so I suggest bringing those forth.
In the theory of programming languages you find both extensional
theories of types (where a datatype is like a class, defined by its
extension) and intensional ones (as you've given above, where two
datatypes can have the same extension but behave differently under the
same operations).
I don't think it is a good idea to attach intensional information to
types, since combining such types into theories is a nightmare. The
object-oriented
programming community has attempted this path and IMO has failed, and
this has had a terribly destructive effect on the field of programming
language engineering. The specifications you describe above should not
be seen as specifying the type; they should be seen as specifying the
type *among other things* such as operations and other types.
Of course, if you don't like what I just said, you can still get
clarity by having distinct notions of "extensional datatype" and
"intensional datatype"...
This is my misunderstanding of the class you described as "datum" then.
I agree we need use cases for this so I can understand how real
examples would be represented in practice to see if I would fully
agree with this; it's too easy to be lost in discussions of abstract
things that don't exist in reality that I can not observe when I am
trying to model real world use cases. I am forced into being a
pragmatist in this sense. Melanie has a bunch from a workshop we had
this week so I guess we can look at those.
Thanks for explanations JR.
JM
> >>datum
> >>-(child) datatype
> >> -(child) datatype integer
> >
> > To say that a datatype is a child (is_a) of datum is not to say that
> > datatype is a child of datum.
> > To say that an animal species is a child of animal is not to say that
> > animal species is a child of animal.
>
>Every instance of datatype integer is_a instance of datum by my text,
>that I believe is correct surely?
that seems to me to be correct
>Is 1001 1001 is_a datum?
'1001 1001' instance_of datum
> If so then surely my hierarchy above holds.
>
>
> >>And "32-bit integer" is that an instance of a "datatype integer
> specification"?
> >
> > Are you really sure you are using the quote marks correctly here?
>
>32-bit integer as an instance of the class datatype integer
That looks wrong, since '32-bit integer' almost certainly designates
a class. Hence it should be
32-bit integer is_a datatype:integer
Every instance of 32-bit integer is an instance of datatype:integer
[I think this is correct.]
BS
The above looks good to me.
>It's not obvious to me that we need to talk about datatypes as
>individuals, but
>I'm sure that's due to ignorance.
>
> > And "32-bit integer" is that an instance of a "datatype integer"?
>
>The datatype 32-bit integer numeral would be an instance of the class
>of datatypes.
This will work only if we can find a way to avoid instances of
instances. To avoid:
A. 10...01 instance of 32-bit integer numeral.
B. 32-bit integer numeral instance of datatype
we need to rewrite B as follows:
B*. datatype:32-bit integer numeral instance of datatype.
The phrase '32-bit integer numeral' is the name for the class of
32-bit integer numerals. The phrase 'datatype:32-bit integer numeral'
is the name for the datatype which they all share.
I think.
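One way to picture A and B* side by side is the following Python sketch. It assumes, for illustration only, that a Python class plays the role of the class of 32-bit integer numerals and that a separate object plays the role of the datatype they share; all names are hypothetical, and the bit string is a made-up example rather than the elided '10...01' above.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Datatype:
    """The class of datatypes; each datatype is an individual (an instance)."""
    name: str

# B*: datatype:32-bit integer numeral instance_of datatype.
DATATYPE_32_BIT_INTEGER_NUMERAL = Datatype("datatype:32-bit integer numeral")

class ThirtyTwoBitIntegerNumeral:
    """The class of 32-bit integer numerals; every member shares one datatype."""
    datatype = DATATYPE_32_BIT_INTEGER_NUMERAL

    def __init__(self, bits: str):
        if len(bits) != 32 or set(bits) - {"0", "1"}:
            raise ValueError("expected exactly 32 binary digits")
        self.bits = bits

# A: a particular numeral instance_of 32-bit integer numeral.
numeral = ThirtyTwoBitIntegerNumeral("1" + "0" * 30 + "1")

print(isinstance(numeral, ThirtyTwoBitIntegerNumeral))          # A
print(isinstance(DATATYPE_32_BIT_INTEGER_NUMERAL, Datatype))    # B*
print(isinstance(numeral, Datatype))                            # no instance of an instance
print(numeral.datatype is DATATYPE_32_BIT_INTEGER_NUMERAL)      # the numeral 'has' the datatype
```

The class and the datatype individual are kept as two separate things here, so no numeral ends up as an instance of something that is itself an instance.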
>Any particular 32-bit integer numeral would "have" that datatype, and would
>be a member of the class of things that have that datatype (the 32-bit
>integer numeral class,
>as opposed to the 32-bit integer numeral datatype).
Good. We seem to be in agreement.
BS
>On Nov 7, 10:39 am, "James Malone" <james.mal...@gmail.com> wrote:
> > Ok just to expand then,
> >
> > datum
> > -(child) datatype
>
>I think Jonathan's suggestion was not that datatype was a subclass of
>datum. Rather any subclass of datum is a "datatype". I.e. "datatype"
>is like "class"
We have the type: datum
We have the extension of this type: class of data
We have subtypes of this type: datatypes (datatype:integer,
datatype:string, etc.)
We have the extensions of these subtypes ({..., '-2', '-1', '0', '1',
'2', ...}, {x | x is a string}, etc.)
I would like to resist, in IAO, the axiom that any subclass of the
class of data defines a corresponding datatype which it is the extension of.
BS
Very good, I am happy to agree.
Questions I still have:
- Are datatypes classes (perhaps classes of a special kind)? If not,
why not - what practical benefit accrues from distinguishing between a
datatype (presumably an individual) and its extension?
- Assuming datatypes are not classes, are they extensional - that is,
can two distinct datatypes have the same extension? If so, why - what
benefit accrues from allowing such distinctions?
- Are datatypes purely syntactic, or do they capture some aspect of
meaning (or throw away some distinctions that could be made via
syntax, such as '3' vs. 'III')?
My opinion is that all such questions should be answered by appeal to
use cases. My prejudice is that nonextensional datatypes would be a
nightmare, but I would welcome evidence that they are needed.
Jonathan
Datatypes are types, which have extensions which are classes.
One example of an advantage of this view is that we know so many
cases where people were confused because they identified types with
their extensions.
Another is: Science is about types, not about extensions
Another is: it allows us to do justice to the fact that we often
know, and need to convey to others, lots of things about the type,
while knowing and needing to convey very few things about the
extensions. Thus we know lots of things about the datatype:integer,
but the extension of this datatype in this actual world is a
constantly changing very ugly long and gappy almost random set of
numerals scattered through the data artifacts of the world.
>- Assuming datatypes are not classes, are they extensional - that is,
>can two distinct datatypes have the same extension? If so, why - what
>benefit accrues from allowing such distinctions?
It is logically possible, I suppose, that two distinct datatypes have
the same extension. Thus suppose we have a world in which there is a
very small class of data artifacts, listed here:
1
2
[these are two numerals]
One data scientist identifies these two artifacts as instances of the
datatype:integer; the other identifies them as instances of the
datatype:string. They mean very different things; yet the extensions
of these two datatypes in that world are identical.
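The small-world example above can be sketched in Python as follows. The Datatype record, its accepts and combine fields, and the extension helper are all hypothetical names introduced for the illustration; the integer membership test assumes a sign-plus-digits lexical form, which the thread does not specify.

```python
import re
from dataclasses import dataclass
from typing import Callable, FrozenSet

@dataclass(frozen=True)
class Datatype:
    name: str
    accepts: Callable[[str], bool]       # which artifacts count as instances
    combine: Callable[[str, str], str]   # one operation the datatype licenses

def extension(datatype: Datatype, world: FrozenSet[str]) -> FrozenSet[str]:
    """The instances of a datatype that actually exist in a given world."""
    return frozenset(artifact for artifact in world if datatype.accepts(artifact))

integer_datatype = Datatype(
    name="datatype:integer",
    accepts=lambda s: re.fullmatch(r"[+-]?[0-9]+", s) is not None,
    combine=lambda a, b: str(int(a) + int(b)),   # addition
)
string_datatype = Datatype(
    name="datatype:string",
    accepts=lambda s: True,
    combine=lambda a, b: a + b,                  # concatenation
)

# The very small world from the example: two numerals and nothing else.
world = frozenset({"1", "2"})

same_extension = extension(integer_datatype, world) == extension(string_datatype, world)
distinct_datatypes = integer_datatype != string_datatype

print(same_extension, distinct_datatypes)
print(integer_datatype.combine("1", "2"), string_datatype.combine("1", "2"))
```

The two datatypes pick out the same artifacts in this world yet treat them differently, which is the sense in which they "mean very different things".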
>- Are datatypes purely syntactic, or do they capture some aspect of
>meaning (or throw away some distinctions that could be made via
>syntax, such as '3' vs. 'III')?
In distinguishing datatype:integer from datatype:string some minimal
reference to semantics is involved.
>My opinion is that all such questions should be answered by appeal to
>use cases. My prejudice is that nonextensional datatypes would be a
>nightmare, but I would welcome evidence that they are needed.
If you view datatypes as extensional, can you describe for me the
datatype:integer?
Is it the class of all actual numerals? (A class which is constantly
changing its membership?)
BS
>Jonathan
>
>