Re: Median age v. range of uncertainties

Rayfo...@aol.com

May 14, 2014, 3:25:59 PM

to ox...@googlegroups.com

Hello,

I found this paper instructive:

Michczyński A. 2007. Is it possible to find a good point estimate of a calibrated radiocarbon date? Radiocarbon 49(2):393-401.

I'm not at all sure that a 2-sigma type of description is meaningful in the context of a non-Gaussian or multimodal PDF. I would always use the 95.4% probability range as the range in which the Sample data calendar date (the 'correct' date) lies with 95.4% probability.

The unknown True calendar date lies in that range with probability 1 or 0.

Point dates, whether Mean, Median or Mode, are seductive but very misleading.

regards

Ray

In a message dated 14/05/2014 18:38:09 GMT Daylight Time, ox...@googlegroups.com writes:

Hello,

Simple questions. There seems to be a tendency of a few people in various disciplines to report the mean Bayesian date (mu) as "the correct date". Doing so, of course, ignores the reality that the entire range of uncertainties most likely contains the correct date. Can anyone point me to a citable paper or two that discuss this? Also, within a 2-sigma spread of uncertainties, what is the statistical probability that the mean date is correct?

Allen

--
You received this message because you are subscribed to the Google Groups "OxCal" group.
To unsubscribe from this group and stop receiving emails from it, send an email to oxcal+un...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Allen W

May 14, 2014, 9:28:05 PM

to ox...@googlegroups.com

Thanks, Ray.

Douglas Harro

May 15, 2014, 1:48:24 AM

to ox...@googlegroups.com

Here is another paper you may find useful.

Telford, R.J., Heegaard and Birks
2004 The Intercept is a Poor Estimate of Calibrated Radiocarbon Age. The Holocene 14(2)

Abstract: Intercept-based methods of generating a point estimate of a calibrated radiocarbon date are very
popular, but exhibit undesirable behaviour. They are highly sensitive to the mean of the radiocarbon date and
to adjustments of the calibration curve. Other methods give more stable results. The weighted average of the
probability distribution function is recommended as the best central-point estimate, but more consideration
should be given to using the full probability distribution rather than a point estimate in developing age-depth
models.

Best,

Doug

On Wednesday, May 14, 2014 12:25:59 PM UTC-7, Ray wrote:

MILLARD A.R.

May 15, 2014, 4:47:17 AM

to ox...@googlegroups.com

An interesting discussion. I'm replying here to several people at once.

> From: Allen
> Sent: 14 May 2014 18:38
>
> Simple questions. There seems to be a tendency of a few people in
> various disciplines to report the mean Bayesian date (mu) as "the
> correct date". Doing so, of course, ignores the reality that the entire
> range of uncertainties most likely contains the correct date. Can
> anyone point me to a citable paper or two that discuss this? Also,
> within a 2-sigma spread of uncertainties, what is the statistical
> probability that the mean date is correct?

I've recently published a paper which covers standards for reporting radiocarbon dates:

Millard AR. 2014. Conventions for reporting radiocarbon determinations. Radiocarbon 56(2):555-559. DOI: 10.2458/56.17455.
https://journals.uair.arizona.edu/index.php/radiocarbon/article/view/17455

On point estimates and ranges the conventions recommend:
"Point estimates of dates (e.g. median calibrated age) cannot represent the uncertainties involved. If point estimates are reported, this should be in addition to probability ranges. Where calibration produces more than one age range, all the ranges or a summary of their overall span should be reported. A probability such as 68% or 95% should be given for each range, and the terms “1-sigma” and “2-sigma” should not be used to describe calibrated dates as they are not meaningful in this context."

In addition, it is always necessary to be aware that with multimodal probability distributions, the mean or median may fall in a region of low probability density and therefore outside the 95% highest posterior density (HPD) range. And *any* point estimate in a continuous distribution has zero probability, though one might argue that, given the way we treat the results of calibration, we are actually talking about the probability of a particular year.
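
A minimal sketch of the multimodal case, assuming a made-up calibrated distribution with two equal, well-separated peaks on a one-year grid. The peak positions, widths and the helper name hpd_mask are illustrative assumptions, not output from OxCal or Calib.

```python
import numpy as np

def hpd_mask(pdf, level=0.95):
    """Mark the highest-density grid cells that together hold `level` of the probability."""
    p = pdf / pdf.sum()
    order = np.argsort(p)[::-1]               # densest cells first
    cum = np.cumsum(p[order])
    n_keep = np.searchsorted(cum, level) + 1  # smallest set reaching the level
    mask = np.zeros(p.size, dtype=bool)
    mask[order[:n_keep]] = True
    return mask

def gauss(x, mu, sd):
    return np.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))

# Hypothetical bimodal calibrated PDF (calendar years, cal AD).
years = np.arange(1000, 1601)
pdf = 0.5 * gauss(years, 1150, 20) + 0.5 * gauss(years, 1450, 20)
p = pdf / pdf.sum()

mean = float(np.sum(years * p))
median = int(years[np.searchsorted(np.cumsum(p), 0.5)])
mode = int(years[np.argmax(p)])

in_hpd = hpd_mask(pdf, 0.95)
mean_in_hpd = bool(in_hpd[np.argmin(np.abs(years - mean))])
median_in_hpd = bool(in_hpd[years == median][0])
mode_in_hpd = bool(in_hpd[np.argmax(p)])

print(f"mean   {mean:.0f}  inside 95% HPD: {mean_in_hpd}")
print(f"median {median}  inside 95% HPD: {median_in_hpd}")
print(f"mode   {mode}  inside 95% HPD: {mode_in_hpd}")
```

With these assumed peaks, the mean and median both land in the trough between the two modes, where almost no probability lies, while the mode sits inside the HPD region.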

> From: Gold
> Sent: 14 May 2014 18:58
>
> For earthquake recurrence for example you need a number, not a
> probability, to feed in to that etc. If forced to use an age, I use
> the mode as a number that at least reflects the sometimes skewed PDF.
> Some use only the full 2 sigma range, which to my mind throws away all
> the information in the PDF, which I think does have some meaning.
> The common problem with terrestrial ages is the multiple peaks in the
> PDF really can't be dealt with in any nice way that I know of

There is no reason why models to compute things like earthquake recurrence cannot be adapted to account for the full uncertainty in the dating. It may be that the programs currently used require exact dates, as was once the case in age-depth modelling using regression techniques (see the Telford et al paper already mentioned), but I'd be surprised if the methods could not be adapted.
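
One way such an adaptation could look, as a rough sketch only: draw calendar ages from each calibrated PDF and carry the draws through the downstream calculation, here simply the time between two events. The two PDFs are invented, and the events are assumed independent with no stratigraphic ordering constraint.

```python
import numpy as np

rng = np.random.default_rng(42)
years = np.arange(0, 2001)  # hypothetical calendar axis, cal AD

def gauss(x, mu, sd):
    return np.exp(-0.5 * ((x - mu) / sd) ** 2)

# Hypothetical calibrated PDFs for two dated events; the older one is bimodal.
pdf_old = 0.6 * gauss(years, 400, 25) + 0.4 * gauss(years, 520, 20)
pdf_young = gauss(years, 1300, 40)

def draw(pdf, n):
    """Draw n calendar years with probability proportional to the PDF."""
    return rng.choice(years, size=n, p=pdf / pdf.sum())

n = 20000
interval = draw(pdf_young, n) - draw(pdf_old, n)  # years between the two events

# Summarise the interval as a 95% probability range, not a single number.
lo, hi = np.percentile(interval, [2.5, 97.5])
print(f"95% range for the interval: {lo:.0f} to {hi:.0f} years")
```

The result is a distribution for the interval, so the multiple peaks in the older date's PDF are carried through instead of being collapsed to a single age.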

I agree that the mode makes sense as a point summary because it at least can be defined in terms of probability, but as a standalone summary it inevitably suffers from ignoring uncertainty. I find the 95% HPD range (it's not 2-sigma) is a useful summary for mental consideration of dates, as I'm not good at manipulating a PDF in my head.

> From: Allen
> Sent: 14 May 2014 19:53
>
> Yes, I agree that using uncertainties makes things complicated, and
> that, in some cases, it makes sense to use mu. However, I've seen
> researchers argue that two dates are not the same when, in fact, their
> uncertainties overlap.

That is why I find a 95% hpd range is the most useful summary because overlap is usually fairly clear even to those who are not used to manipulating probabilities.

> Another puzzle. Oxcal typically reports the mean (mu) of the calibrated
> date. To the contrary, when I use the latest Calib calculator, it
> reports only the median value in the calout.csv.

OxCal reports what you ask it to. You can choose mean and/or median. Under Format > Show > Summary statistics the options are mu, sigma, median. (And comparison of sigma with 68% ranges easily demonstrates that a 68% probability range is not a 1-sigma range.)

> From: Ray
> Sent: 14 May 2014 20:26
>
> I would always use the 95.4% probability range as the range where the
> Sample data calendar date (the 'correct' date) lies with 95.4%
> probability. The unknown True calendar date lies in that range with
> probability 1 or 0

I think you are confusing a 95% confidence interval from classical statistics, where the philosophy of the approach leads to the unknown true value lying in the range with probability 1 or 0, with a Bayesian 95% probability range (aka credibility range), where the philosophy of the approach means that the location of the unknown true value is being expressed as a probability (strength of belief) of 95%. With calibration, and especially with models, we are dealing with the latter.

Best wishes

Andrew
--
Dr. Andrew Millard
e: A.R.M...@durham.ac.uk | t: +44 191 334 1147
w: http://www.dur.ac.uk/archaeology/staff/?id=160
Senior Lecturer in Archaeology, Durham University, UK

Rayfo...@aol.com

May 15, 2014, 12:04:02 PM

to ox...@googlegroups.com

Andrew,

Thanks for the comment re True calendar age.

What I was implying, and probably put badly, was that the Radiocarbon Determination, based on sampling, and assumed near Gaussian, may or may not contain the True Radiocarbon Determination, in a classical sense. If it is not contained in the range, then I doubt the subsequent Bayesian analysis will correct that since it is unknown.

The Bayesian Credible Interval, as I understand it, refers to the sampled data, and gives a degree of belief that the sample 'correct' calendar value lies in the given posterior distribution range. The True calendar Age remains unknown. Or is that incorrect?

Best wishes

Ray

MILLARD A.R.

May 15, 2014, 1:00:30 PM

to ox...@googlegroups.com

> From: Ray
> Sent: 15 May 2014 17:04
>
> Thanks for the comment re True calendar age.
>
> What I was implying and probably put badly, was that the Radiocarbon
> Determination, based on sampling, and assumed near Gaussian, may or may
> not contain the True Radiocarbon Determination, in a classical sense.
> If it is not contained in the range, then I doubt the subsequent
> Bayesian analysis will correct that since it is unknown.
> The Bayesian Credible Interval, as I understand it, refers to the
> sampled data, and gives a degree of belief that the sample 'correct'
> calendar value lies in the given posterior distribution range. The True
> calendar Age remains unknown. Or is that incorrect?

What is the difference between the "sample 'correct' calendar value" and "True calendar Age"?

The Gaussian distribution has support on the whole of the real line, so the unknown true radiocarbon content always has some non-zero probability density even if it is far from the mean.

In a Bayesian analysis each unknown value has a probability distribution which expresses belief in each possible value being the true value. A Bayesian posterior credible interval refers to the degree of belief in the unknown falling within that range given the observed data. Equivalently, there is a prior credible interval representing beliefs before observing data. In principle this applies to all unknowns, so in a stratigraphic model we could update our beliefs about the radiocarbon content of a sample as well as its calendar date; but in practice we don't bother to do this for every single unknown in the model.

Bruce R. Bachand

May 15, 2014, 1:45:50 PM

to ox...@googlegroups.com

Andrew makes an important point. The great thing about Bayesian models is that they can always be redone with better data. The perpetual problem in my field is that the priors or inputs, i.e., the assays or determinations, are not fully described and justified--few see that as an integral part of the Bayesian analysis. Researchers spend all their time with the models and no time assessing the viability of the dated samples themselves. As far as HPD modal values are concerned (the original question), I only place stock in them when the most rigorous scientific, descriptive, and statistical standards are met, that is, when the priors are fully described, when their positions in the model are justified, and when some sensitivity testing has been conducted. All these things affect my "believability."

--

Bruce R. Bachand, MSLS, PhD
Research Associate

New World Archaeological Foundation
800 SWKT
Brigham Young University
Provo, UT 84602

Rayfo...@aol.com

May 17, 2014, 6:07:30 AM

to ox...@googlegroups.com

Hello Andrew, Bruce,

Thanks for continuing to engage! If you will indulge me a little more;

I have no problems with the Credible Interval concept of a Bayesian model with constraints influencing the data. However, with a single date calibration, there is no Posterior, only the Likelihood, viz. (I have created a wiggle-free calibration curve for clarity):

The table view shows as 'Unmodelled'.

The Model Specification shows no Constraints, no Priors and no Groupings are used in calculating the calibrated range. (however, perhaps a uniform prior is implicit in the calibration?)

As I understand it, the unknown parameter of the True Radiocarbon isotope ratio is substituted by SAMPLED data that gives a SAMPLE middle value and error term whose uncertainty is modeled by a Gaussian distribution. Unless we have an infinite set of independent samples we cannot know if the SAMPLE uncertainty contains the True Radiocarbon Isotope ratio. It does or doesn't with no probability attached. We calibrate as above. If perchance the unknown True Radiocarbon Ratio is NOT in the Sample range then I cannot see how it can then appear in the calibrated range as a True Calendar age. However the SAMPLE calibrated Calendar age does. I called it the SAMPLE 'correct' age. Hence my comment.

Of course given further information, e.g. the missing Constraints, Priors and Groupings, a Bayesian Model produces a posterior plot with Credible Interval.

Best wishes

Ray

MILLARD A.R.

May 17, 2014, 7:50:36 AM

to ox...@googlegroups.com

Ray,

In the figure you have probabilities. You don’t get those with classical likelihood (confidence interval), but you do with a Bayesian posterior credible interval. This is a Bayesian calibration, assuming a uniform prior on the calendar age over the whole span of the calibration curve. The likelihood is represented by the red bell curve. That the process used for ‘simple’ calibration of radiocarbon date is a Bayesian one was pointed out many years ago:

Dehling H, van der Plicht J. 1993. Statistical problems in calibrating radiocarbon dates. Radiocarbon 35:239-244.

Buck CE, Kenworthy JB, Litton CD, Smith AFM. 1991. Combining archaeological and radiocarbon information: a Bayesian approach to calibration. Antiquity 65:808-821.

What do you mean by sampled data here? I had thought you were talking about the sample (charcoal, bones etc.) that was dated, but perhaps you are talking about the MCMC numerical samples?
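
The 'simple' calibration described above can be sketched on a grid as follows. This is a minimal illustration, assuming a made-up, wiggle-free straight-line calibration curve and an invented measurement. It is not OxCal's code and uses no real calibration data.

```python
import numpy as np

# Hypothetical, wiggle-free calibration curve: radiocarbon age (BP) as a
# function of calendar age (cal BP), with a constant curve uncertainty.
cal_bp = np.arange(2000, 4001)        # calendar ages under consideration
curve_mu = 0.95 * cal_bp + 150.0      # made-up straight line
curve_sd = np.full(cal_bp.shape, 15.0)

y, y_sd = 3000.0, 30.0                # measured 14C age and its 1-sigma error

# Likelihood of the measurement for every candidate calendar age, combining
# measurement and curve uncertainty in quadrature.
var = y_sd**2 + curve_sd**2
likelihood = np.exp(-0.5 * (y - curve_mu) ** 2 / var) / np.sqrt(2 * np.pi * var)

# Uniform prior over the whole span of the curve, then Bayes' theorem.
prior = np.full(cal_bp.shape, 1.0 / cal_bp.size)
posterior = prior * likelihood
posterior /= posterior.sum()

print("posterior mode (cal BP):", cal_bp[np.argmax(posterior)])
print("smallest posterior value:", posterior.min())
```

Because the prior is flat, the posterior is just the likelihood rescaled to sum to one, which is why the two are easy to mistake for each other. Every calendar age on the grid keeps a non-zero, if tiny, posterior probability.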

Bruce R. Bachand

May 17, 2014, 2:56:56 PM

to ox...@googlegroups.com

It appears I said something wrong and therefore confusing in my last post (been some years since I've talked about this material). The lab dates (i.e., the assays or determinations) are independent of the priors. The priors are all the beliefs and observations one has about the samples and sample contexts from which the dates are derived. The point I tried to make is that researchers often don't scrutinize the nature of their samples enough to accurately gauge how closely they can be expected to match the event or temporal phenomenon of interest. In short, they neglect what's often called in my field, archaeological systematics or formation processes.

Ray's point seems directed at the vagaries of radiocarbon dating itself, something that's always evolving and therefore critical to remain current with. Bayesian radiocarbon dating is a superb example of how science is an art. When is it bad art, and when is it good? When is it being used effectively and with accuracy and when is it being abused? These are issues to consider as the method goes mainstream.

Rayfo...@aol.com

May 17, 2014, 4:16:44 PM

to ox...@googlegroups.com

Hello Andrew, Bruce,

Thanks again for the discussion.

To clarify, when referring to 'sampled data' I mean the Radiocarbon isotope sampling that gives the estimate of 14C ratio. Being a sampling measurement process, the Sample middle value and error term may or may not include the TRUE population 14C ratio. We work with the Sampled Data and assume its uncertainty is near Gaussian. When we calibrate using Bayesian methods, if the Sampled Data (the Standardized Likelihood?) does NOT include the TRUE 14C population ratio (it may or may not), then I fail to see how it will appear in the Posterior distribution. The scheme only addresses the data we have, not the unknown parameter we may not have. Classical statistics would suffer similarly, but Bayesian allows further knowledge to be incorporated as it comes in.

In the Calibrated Plot I agree that the red bell curve shows the likelihood, however when switching on and off the 'likelihood' selection, it is the black calibrated bell curve that appears, presumably representing the 'Standardized Likelihood'?

So, given the Data, the Bayesian analysis produces a Credible Interval for the calibrated calendar age, but I think the assumption is that the True 14C ratio is included in the sampled data, which it may not be. I just have difficulty in understanding how it is claimed that the TRUE calendar age is in the interval when it is not represented in the Sample data. I could live with the TRUE Sampled Calendar age.

I have the Buck et al paper thanks, I'll see if I have the Dehling et al.

Best wishes

Ray


MILLARD A.R.

May 19, 2014, 12:45:51 PM

to ox...@googlegroups.com

> From: Ray
> Sent: 17 May 2014 21:17
>
> To clarify, when referring to 'sampled data' I mean the Radiocarbon
> isotope sampling that gives the estimate of 14C ratio. Being a
> sampling measurement process, the Sample middle value and error term
> may or may not include the TRUE population 14C ratio. We work with
> the Sampled Data and assume its uncertainty is near Gaussian.

When we make an observation, the likelihood always includes the true value because the full likelihood covers all possible values, and for each of them gives the probability density of observing the data if that were the true value. If you reduce the likelihood to a confidence region, such as a 95% CI, then it may or may not include the true value.

I think you are confusing the frequentist approach to p-values, where our measurement is considered to be one of an infinite series of repeated measurements, and is thus a probabilistic sample from that distribution of observations, with Bayesian approaches where the observation is regarded as given and the probabilistic statements are made about the true value.
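
The frequentist half of this distinction can be illustrated with a small simulation. It is a sketch under stated assumptions: the true value and measurement error below are invented, and in practice the true value is of course unknown.

```python
import numpy as np

rng = np.random.default_rng(0)
true_value = 3000.0    # hypothetical true 14C age, unknown in practice
sd = 30.0              # measurement standard deviation
n_repeats = 100_000

# Frequentist picture: the same measurement repeated many times.
y = rng.normal(true_value, sd, size=n_repeats)
covered = np.abs(y - true_value) <= 1.96 * sd   # does y +/- 1.96 sd contain the truth?
coverage = covered.mean()
print(f"fraction of 95% intervals containing the true value: {coverage:.3f}")

# One measurement whose 95% interval missed the true value: the full
# likelihood still gives the true value a non-zero density.
missed = y[~covered][0]
density_at_truth = np.exp(-0.5 * ((missed - true_value) / sd) ** 2) / (sd * np.sqrt(2 * np.pi))
print(f"measurement {missed:.0f}: likelihood density at the true value = {density_at_truth:.2e}")
```

Each individual interval either contains the true value or does not; the 95% describes the long-run behaviour of the procedure. The full likelihood never excludes the true value, it only makes it less plausible.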

> In the Calibrated Plot I agree that the red bell curve shows the
> likelihood, however when switching on and off the 'likelihood'
> selection, it is the black calibrated bell curve that appears,
> presumably representing the 'Standardized Likelihood'?

I think OxCal's terminology is confusing here.

> ... I
> just have difficulty in understanding how it is claimed that the TRUE
> calendar age is in the interval when it is not represented in the
> Sample data.

But a Bayesian credible interval does not claim that the true value MUST lie in the region; it claims that the true value lies there with a certain probability, and that probability uses the likelihood across all possible measurement values. For a simple calibration there is a (possibly very small) non-zero probability for every calendar age. When there is a more complex model, the prior may lead to certain calendar ages having zero probability, and that is where Bruce's point about careful scrutiny of models is important, because a region with zero prior probability will have zero posterior probability whatever the data indicate. So although we use that type of prior, it doesn't always fully represent our prior beliefs, which usually allow a small probability that the model is wrong in assigning zero prior probability.
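
The effect of a zero prior can be seen in a toy grid calculation. The likelihood, the AD 1200 constraint and the small allowance eps below are all invented for illustration; the softened prior is one possible way of expressing doubt about a constraint, not a description of what OxCal does.

```python
import numpy as np

years = np.arange(1000, 1401)   # hypothetical calendar years, cal AD

# Likelihood from the data: strongly favours dates around AD 1100.
likelihood = np.exp(-0.5 * ((years - 1100) / 20.0) ** 2)

# Hard prior from a model constraint: the event is taken to postdate AD 1200.
hard_prior = np.where(years >= 1200, 1.0, 0.0)

# Softened prior: leaves a small weight (eps, an assumption) on the
# possibility that the constraint is wrong.
eps = 0.01
soft_prior = np.where(years >= 1200, 1.0, eps)

def posterior(prior):
    post = prior * likelihood
    return post / post.sum()

mass_hard = posterior(hard_prior)[years < 1200].sum()
mass_soft = posterior(soft_prior)[years < 1200].sum()

print(f"posterior mass before AD 1200, hard prior: {mass_hard:.4f}")
print(f"posterior mass before AD 1200, soft prior: {mass_soft:.4f}")
```

Under the hard prior the data cannot move any probability before AD 1200, however strongly they point there. With even a small allowance for the constraint being wrong, the posterior follows the data.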

Rayfo...@aol.com

May 20, 2014, 4:08:24 PM

to ox...@googlegroups.com

Hello Andrew,

Thanks for the continuing discussion and in particular for bringing my attention to the Dehling, Van der Plicht paper (DVdP).

From reading it, I realise you may have formed the opinion that I am confusing (or conflating) Frequentist and Bayesian statistical methods. Hence you directed me to:

“That the process used for ‘simple’ calibration of radiocarbon date is a Bayesian one was pointed out many years ago:

Dehling H, van der Plicht J. 1993.”

Whilst I am grateful for the guidance, I don’t believe I ever implied that the ‘simple’ calibration of radiocarbon date should be anything other than Bayesian.

My problem is with the Tautology ‘The True value of the radiocarbon ratio is unknown, it is in or out of the interval’ i.e. the statement ‘it is in or out of the interval’ is always true.

In DVdP they say:

“From a classical viewpoint, the parameter h (the True 14C ratio value) either lies in the confidence region Co or not, but we cannot determine which. Neither could a Bayesian approach compute the probability that h is an element of Co, because this depends essentially on the prior probability.”

(my emphasis).

And:

“As observed, error curves drawn for yr BP and cal AD/BC axes are likelihood functions for x and h. Thus the 2s interval on the BP axis, along with the level set on the AD/BC axis, are likelihood-based confidence regions with confidence level 95%. Now the confidence region for x covers the true (calendar) age x if and only if |f(x)-y|<= 2s. But this has again a probability of 95% since y follows a normal distribution with mean x and variance s^2.”

I take (covers the true (calendar) age x if and only if |f(x)-y|<= 2s) to imply that the ‘Tautology’ is still there.

Subsequently DVdP address the treatment of the Bayesian Model with additional prior information which refines the Credible Interval. I read somewhere (and I can’t recall where) that the assumption made in the calibration is that the True 14C age lies in the measurement distribution. That seems understandable and quite valid and really helps in getting to a conclusion. Effectively it is a way of dealing with the Tautology elephant.

I think all I was saying was that it is still there.

Best Wishes

Ray
