Sep 10, 2021, 5:18:33 PM
cross-posted to sci.stat.math where there is little traffic of late.
I hope that what follows "LECTURE MODE" is useful enough that
the reader will excuse my self-indulgence for going on at length.
On Fri, 10 Sep 2021 11:48:20 +1100, Peter Moylan wrote:
>On 10/09/21 10:31, Mark Brader wrote:
>> Tak To:
>>> The units are respectively "degree Celsius" and "degree
>>> Fahrenheit". They are both ordinals (marks on a scale) and
>>> cardinals (distances between pairs of marks).
>> No, both words are being used wrongly.
>> "Cardinal" refers to a number that tells how many of something
>> exist. It is always either a non-negative integer or an infinite
>> "Ordinal" refers to a number that tells the position of one
>> particular thing in a sequence of things, by giving the number of
>> things -- always a cardinal number -- from the start of the sequence
>> to the one in question.
>> In the languages I know about, the fact that a number is ordinal is
>> expressed by modifying its name. For example, the cardinal number
>> "six" or "6" corresponds to the ordinal "sixth" or "6th", or if that
>> word "six" was in French, then to "sixième" or "6e".
>> Neither one is relevant either to temperature readings or to
>> distances; these are not constrained to be integers.
>In the case of temperature the distinction is between "degrees Celsius"
>and "Celsius degrees". The first is a point on a scale, and the second
>is an interval.
>The two are related by simple arithmetic operations (addition and
>subtraction). Multiplication and division are meaningful when applied to
>intervals, but not when applied to points on a scale, except when it
>happens to be a scale with an absolute zero (which Celsius doesn't have).
Counter-example: One star may be "twice as hot" as another
and F, C or K does not matter because, given the precision
of estimates, the nominal zero is close enough to absolute zero.
Zero degrees centigrade is the freezing point of water; it is an
absolute zero in relation to the heat added to ice which is at
Ratio vs. interval vs. ordinal (ordered).
I'm not sure whether all professional statisticians are clear on
the notion that the intervals in "scaling" depend on the context
and what you are comparing to. The important underlying idea
for the usual testing by ANOVA is that there are "equal intervals"
between scores. When someone intuitively refers to multiples like
"twice as much" of something, it implies (reversing the logic) that
there IS some unmentioned value that serves as an "absolute zero".
I suggest, for an example, that some balmy temperature serves
as zero so that an extra 5 degrees C or 10 degrees F is "hot",
and twice as much increase makes it "twice as hot" -- probably
on the heat-discomfort scale that compensates for relative humidity.
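A minimal sketch of that idea, assuming a hypothetical "balmy" baseline of 20 degrees C (the baseline and the sample readings are made up for illustration). Once readings are measured as excesses over a chosen zero, the ratio is the same in C and in F, while the ratio of the raw readings is not.

```python
def c_to_f(c):
    """Convert a Celsius reading to Fahrenheit."""
    return c * 9 / 5 + 32


def times_as_hot(temp, reference, zero):
    """Ratio of two readings, each measured as an excess over `zero`."""
    return (temp - zero) / (reference - zero)


# Hypothetical values: a balmy baseline and two warmer readings, in Celsius.
baseline_c, warm_c, hot_c = 20.0, 25.0, 30.0

# Ratio of excesses over the baseline, computed in each scale.
ratio_c = times_as_hot(hot_c, warm_c, baseline_c)
ratio_f = times_as_hot(c_to_f(hot_c), c_to_f(warm_c), c_to_f(baseline_c))

# Ratio of raw readings, which depends on where each scale puts its zero.
raw_c = hot_c / warm_c
raw_f = c_to_f(hot_c) / c_to_f(warm_c)

print(ratio_c, ratio_f)  # agree with each other
print(raw_c, raw_f)      # disagree with each other
```

The excess ratio survives the change of units because converting an interval from C to F is a pure multiplication; the offset of 32 cancels in the subtraction.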
The short-sighted error is to believe that WHAT you are
measuring is all that you need to know, /because/ units of Time
or Distance (or Temperature) have "inherently equal" intervals.
John Tukey recommended that any time your largest "natural"
score (having some zero) is 10 times the smallest, you should
consider whether a transformation will improve the analysis.
You keep in mind what constitutes "equal intervals" for your data.
"Counts" sound like they should create natural, equal intervals.
However, as counts arise in the world, they very often come
out with Poisson distributions; and taking the square root of
counts is often the best starting point for analyses when there
is that 10-fold range.
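As a rough sketch of how that check might look in practice (the function names and the sample counts are my own, for illustration only): compute the ratio of largest to smallest score, and if it reaches 10, try the square root. For Poisson counts the variance equals the mean, so the square root roughly evens out the spread across small and large counts.

```python
import math


def range_ratio(scores):
    """Largest score divided by the smallest; requires positive scores."""
    lo, hi = min(scores), max(scores)
    if lo <= 0:
        raise ValueError("scores must be positive to form a ratio")
    return hi / lo


def sqrt_counts(counts):
    """Square-root transform, a common starting point for count data."""
    return [math.sqrt(c) for c in counts]


# Hypothetical counts spanning more than a 10-fold range.
counts = [4, 9, 16, 25, 100]

if range_ratio(counts) >= 10:
    # The 10-fold rule of thumb says a transformation is worth considering.
    print(sqrt_counts(counts))
```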
Distances for spread of disease in epidemiology are modeled
on the reciprocal of the values in meters or kilometers.
Miles per gallon (US) and Liters per 100 km (Europe) are
implicitly reciprocal, though both "look" like they could have
equal intervals. The latter works better in most analyses I've
seen (that is, has a better "equal-interval" nature).
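A small sketch of the reciprocal relationship, taking the European figure as liters per 100 km (the usual convention) and using the defined sizes of the US gallon and the mile. The sample mpg values are hypothetical. Equal steps in mpg turn out to be very unequal steps in fuel actually burned.

```python
LITERS_PER_US_GALLON = 3.785411784  # exact, by definition
KM_PER_MILE = 1.609344              # exact, by definition


def mpg_to_l_per_100km(mpg):
    """Convert US miles per gallon to liters per 100 km."""
    if mpg <= 0:
        raise ValueError("mpg must be positive")
    return 100.0 * LITERS_PER_US_GALLON / (KM_PER_MILE * mpg)


# Hypothetical cars, spaced at equal 10-mpg intervals.
for mpg in (10, 20, 30, 40):
    print(mpg, round(mpg_to_l_per_100km(mpg), 2))
```

Going from 10 to 20 mpg saves far more fuel over the same distance than going from 30 to 40 mpg, which is one way to see why the fuel-per-distance form behaves more like an equal-interval scale for fuel use.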
Bio-chemical levels (hormones, whatever) are often log-transformed
at the start of analyses; they often represent geometric or
exponential growth of /something/.
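A tiny illustration, with made-up levels that double at each step: on the raw scale the gaps grow, while on the log scale they are equal. The base of the logarithm (2 here) is an arbitrary choice; any base gives equal gaps.

```python
import math

# Hypothetical hormone levels that double at each step.
levels = [5.0, 10.0, 20.0, 40.0]

# Gaps between neighbors on the raw scale: they keep growing.
raw_gaps = [b - a for a, b in zip(levels, levels[1:])]

# Gaps between neighbors after a log transform: all the same size.
logged = [math.log2(x) for x in levels]
log_gaps = [b - a for a, b in zip(logged, logged[1:])]

print(raw_gaps)
print(log_gaps)
```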
Proportions (P) bounded at 0 and 1 are often analyzed these
days by the "logit", which is the log of P/(1-P).
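For concreteness, a minimal sketch of the logit and its inverse (the function names are mine). The transform is symmetric around P = 0.5 and stretches out both ends, so that a step from 0.98 to 0.99 counts for much more than a step from 0.50 to 0.51.

```python
import math


def logit(p):
    """Log odds of a proportion strictly between 0 and 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must be strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def inv_logit(x):
    """Map a log-odds value back to a proportion."""
    return 1.0 / (1.0 + math.exp(-x))


# Compare the size of a 0.01 step in the middle and near the top.
middle_step = logit(0.51) - logit(0.50)
top_step = logit(0.99) - logit(0.98)
print(middle_step, top_step)
```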
The so-called "non-parametric approach" to statistical analysis
most often starts out by ranking the ordered scores; then it
treats the differences between ranks as equal. There are
so-called "exact" tests which may be better in tiny samples
with no ties; for large samples ANOVAs performed on the
ranks, as scores, work as well as (and often better than)
the original authors' approximations from pre-computer days.
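A sketch of what "ANOVA performed on the ranks" amounts to, in plain Python with hypothetical data: pool the scores, replace each by its rank (ties share the average rank), then compute the ordinary one-way F statistic on those ranks. This is closely related to the Kruskal-Wallis test; the helper names are my own.

```python
def midranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def anova_f(groups):
    """Ordinary one-way ANOVA F statistic for a list of groups."""
    pooled = [x for g in groups for x in g]
    n, k = len(pooled), len(groups)
    grand = sum(pooled) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def rank_anova_f(groups):
    """Rank the pooled scores, then run the one-way ANOVA on the ranks."""
    pooled = [x for g in groups for x in g]
    ranks = midranks(pooled)
    ranked_groups, start = [], 0
    for g in groups:
        ranked_groups.append(ranks[start:start + len(g)])
        start += len(g)
    return anova_f(ranked_groups)


# Hypothetical scores; note the wild value 900 in the last group.
groups = [[10, 20, 30], [40, 50, 60], [70, 80, 900]]
print(anova_f(groups), rank_anova_f(groups))
```

The outlier drags the raw-score F around, but on the ranks it is simply "the largest", one step above its neighbor.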
For "ranks" where both ends are meaningful, rankings
can be converted to percentiles, and then to logits.
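A sketch of that two-step conversion. One detail has to be assumed: how to turn a rank into a proportion without hitting exactly 0 or 1, where the logit is undefined. Here I use (rank - 0.5) / n, which is one common convention among several, not the only choice.

```python
import math


def rank_to_proportion(rank, n):
    """Map rank 1..n to a proportion strictly inside (0, 1).

    Uses (rank - 0.5) / n, an assumed convention for illustration.
    """
    if not 1 <= rank <= n:
        raise ValueError("rank must lie between 1 and n")
    return (rank - 0.5) / n


def rank_to_logit(rank, n):
    """Convert a rank to a proportion, then to log odds."""
    p = rank_to_proportion(rank, n)
    return math.log(p / (1.0 - p))


# Hypothetical ranking of 9 items: both ends get stretched,
# while the middle ranks stay close together.
n = 9
for rank in range(1, n + 1):
    print(rank, round(rank_to_logit(rank, n), 3))
```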
For ranks where #1 matters most, the log of the rankings
improves the distance between scores, though not very
precisely. For instance, the implied gap between #1 and #2
is much closer in meaning to the gap between 40 and 80
than to the gap between 40 and 41 or 80 and 81.
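The arithmetic behind that comparison, as a quick check (log base 2 is an arbitrary choice; the function name is mine). On the log scale a gap depends only on the ratio of the two ranks, so #1 to #2 is the same size as 40 to 80, and both are far larger than 40 to 41.

```python
import math


def log_rank_gap(better, worse):
    """Distance between two ranks after a log (base 2) transform."""
    return math.log2(worse) - math.log2(better)


print(log_rank_gap(1, 2))    # a doubling of rank
print(log_rank_gap(40, 80))  # also a doubling of rank
print(log_rank_gap(40, 41))  # a single step far down the list
print(log_rank_gap(80, 81))  # a single step even farther down
```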