A question about modelled data - PDF


Andrew

Aug 9, 2017, 4:24:12 AM
to OxCal

Hoping someone can explain the following?


I have a site in New Zealand with stratigraphy and a series of dates.


When I model some of the dates in OxCal using the Sequence, Boundary, Phase - Phase - Phase - end method, the PDFs for the first phase are 'squeezed up' into a tighter likelihood. My question is: why does the model seem to force the PDF to the youngest end of the age distribution, closest to the subsequent phase likelihood, rather than, say, to an older section of the likelihood distribution?


Does anyone have any idea as to the mathematical reason it does this? While I'm writing this I'm wondering if it has anything to do with options I'm not using, such as Tau_Boundary or Sigma_Boundary? I haven't the faintest clue when it would be suitable to use either of these, or what they do to a model.


I've pasted below the OxCal code of the example I'm working on at the moment, and you can see that the initial "Boundary start 1" likelihood "squeezes up", which forces the IBP_clearance phase to the late end of its PDF. As it turns out, the archaeological contexts do suggest that this may most closely represent the actual deposition; however, I'm most interested in being able to account for this tendency in the model's behaviour, and in any advice on manipulating it.


Any clues?


With thanks,

Andrew Hoffmann.



CODE USED -


Plot()
{
 Sequence("model 2")
 {
  Boundary("start 1");
  Phase("IBP_clearance")
  {
   Curve("ShCal13","ShCal13.14c");
   R_Date("NZA60321", 360, 22);
   R_Date("NZA60799", 387, 19);
   R_Date("NZA60802", 381, 19);
  };
  Boundary("end 1");
  Boundary("start 2");
  Phase("F46_midden")
  {
   Curve("ShCal13","ShCal13.14c");
   R_Date("NZA60221", 316, 21);
   R_Date("NZA60800", 279, 20);
  };
  Boundary("end 2");
  Boundary("start 3");
  Phase("midden I & J")
  {
   Curve("ShCal13","ShCal13.14c");
   R_Date("Wk-45431", 335, 16);
   R_Date("Wk-45432", 298, 15);
   R_Date("Wk-45433", 256, 17);
   R_Date("Wk-45434", 290, 15);
   R_Date("Wk-45435", 298, 16);
  };
  Boundary("end 4");
 };
};

Rayfo...@aol.com

Aug 9, 2017, 5:51:07 AM
to ox...@googlegroups.com
Hi Andrew H.
 
The A model index (74) is over 60, so the model as set up is acceptable.  You have a set of three phases that are Sequential, so given the dates and given the model, the result is the best posterior solution.  Note you only need to name the Curve once in 'Options' if all dates use the same curve.
 
I've also tried it with three contiguous phases, i.e. the boundary between each pair of phases is a single transition (1/2 etc.).
 
 Plot()

 {
  Curve("ShCal13","ShCal13.14c");
  Sequence("model 2")
  {
   Boundary("start 1");
   Phase("IBP_clearance")
   {
    R_Date("NZA60321", 360, 22);
    R_Date("NZA60799", 387, 19);
    R_Date("NZA60802", 381, 19);
   };
   Boundary("Transition 1/2");
   Phase("F46_midden")

   {
    R_Date("NZA60221", 316, 21);
    R_Date("NZA60800", 279, 20);
   };
   Boundary("Transition 2/3");

   Phase("midden I & J")
   {
    R_Date("Wk-45431", 335, 16);
    R_Date("Wk-45432", 298, 15);
    R_Date("Wk-45433", 256, 17);
    R_Date("Wk-45434", 290, 15);
    R_Date("Wk-45435", 298, 16);
   };
   Boundary("End 3");
  };
 };
This gives A model of 83, so again an acceptable model.  Comparing the F model index it would seem to be a better model.  (View - Model specification - F-model).
 
I don't think the boundary 'squeezes up' the dates.  The Bayes paradigm takes the date pdfs and, given the model prior information (i.e. that they are in a sequence of three phases), calculates the posterior solution.  You can choose other prior information, e.g. type of boundary, curve, offsets etc., but really only if you have evidence to support it.
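
One way to see where the tightening may come from is the uniform phase prior itself. The following is a sketch of that standard prior only, not a full account of what OxCal computes for this model. The symbols a, b, n and t_i are introduced here for illustration: a and b are the start and end boundaries of a phase, and t_1 ... t_n are the calendar dates of its n events.

```latex
p(t_1,\dots,t_n \mid a,b) \;=\; \prod_{i=1}^{n} \frac{1}{b-a} \;=\; (b-a)^{-n},
\qquad a \le t_i \le b .
```

Because this factor grows as b - a shrinks, the posterior tends to favour phases that are no longer than the dates require. When a likelihood is broad or multi-modal, as it can be across a calibration-curve inversion, and the Sequence forces the following phases to be later, the more compact arrangement can end up carrying most of the posterior probability.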
 
If you are uneasy with the fact that the posterior seems so 'squeezed up', try a model with Simulated dates to confirm the efficacy of the model.
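
A minimal sketch of such a check, assuming the contiguous three-phase structure above. The calendar years passed to R_Simulate are hypothetical values chosen purely for illustration (they are not results from the site); the uncertainties are copied from the real measurements. If the posteriors recover the simulated years reasonably well, that gives some reassurance about the model structure.

```oxcal
Plot()
{
 Curve("ShCal13","ShCal13.14c");
 Sequence("simulated check")
 {
  Boundary("start 1");
  Phase("IBP_clearance (simulated)")
  {
   // R_Simulate(name, calendar year AD, 14C uncertainty)
   // Calendar years below are hypothetical "true" ages, not site results
   R_Simulate("sim 1a", 1520, 22);
   R_Simulate("sim 1b", 1530, 19);
   R_Simulate("sim 1c", 1540, 19);
  };
  Boundary("Transition 1/2");
  Phase("F46_midden (simulated)")
  {
   R_Simulate("sim 2a", 1600, 21);
   R_Simulate("sim 2b", 1620, 20);
  };
  Boundary("Transition 2/3");
  Phase("midden I & J (simulated)")
  {
   R_Simulate("sim 3a", 1660, 16);
   R_Simulate("sim 3b", 1670, 15);
   R_Simulate("sim 3c", 1680, 17);
   R_Simulate("sim 3d", 1690, 15);
   R_Simulate("sim 3e", 1700, 16);
  };
  Boundary("End 3");
 };
};
```

Each run draws new simulated measurements, so it is worth running the model several times and with different assumed calendar years before drawing conclusions.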
 
Best wishes
 
Ray Kidd
 
 

Andrew

Aug 10, 2017, 1:44:12 AM
to OxCal
Hi Ray,

Thanks for taking the time to look at this. I've run the code you pasted and see it has a similar result.
I've used OxCal before, but in most situations we have so few dates, and so few sites with stratigraphy in New Zealand, that I'm not proficient in the model options available.

I may be repeating myself, apologies if I am, but can you help me understand how under this model the 'Boundary start 1' PDF has been established? 

If you can follow my argument, suppose for example that I can bring no other information to the model other than the knowledge that NZ was settled in 1320 AD, and I have no reason to believe the Phase IBP_clearance dates could not be as early as the earliest portion of the three dates' PDFs. Why then does the 'Boundary start 1' PDF not allow for that? In other words, are there other boundary options that could be used to 'loosen' the initial 'Boundary start 1' PDF?

Also, there is a period of time of unknown length that I would like to define somehow. It relates to a period of intensive horticultural activity, which falls between the 'Phase F46_midden' dates and the dates for the last phase, 'midden I & J'. What can I add to the model to try to establish a likely period for this activity?

Kind regards,
Andrew Hoffmann.

Christian Hamann

Aug 10, 2017, 2:45:44 AM
to OxCal
Hi Andrew,

Setting 'UniformSpanPrior' to FALSE might give you what you are looking for. Just insert this snippet before your Plot() command:

 Options()
 {
  UniformSpanPrior=FALSE;
 };

The math behind the uniform span can be found here:

https://c14.arch.ox.ac.uk/oxcal3/math_gi.htm#boundary

@Christopher: There is a small typo in the heading of the corresponding subsection (Additional factors for unform overall span).

Best regards,
Christian

Andrew

Aug 10, 2017, 4:24:56 PM
to OxCal
Thanks Christian,

I tried that option and it does seem to provide the answer.
The more I look into this as a non-mathematician the more I question!
Cheers.
Andrew.

Rayfo...@aol.com

Aug 10, 2017, 4:29:37 PM
to ox...@googlegroups.com
Hi Andrew H,
 
Plenty of maths via Christian's link to V 3.1.  If you have a look at View - Plot on Curve, there is a very large curve inversion that is responsible for the early spread of dates.  The UniformSpanPrior default of TRUE pushes the boundary beyond this.
 
If you have information of TAQ/TPQ then you can put in a Before or After command as suits.
 
If you think there is a horticultural activity somewhere, you can put an Interval() command in, say between Boundary end 2 and boundary start 3, which gives  a probable interval of about 0-10 years.  The Difference command is also available.
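
A sketch of how that might look, assuming the original model with separate end and start boundaries, and with the curve named once at the top. The labels "horticulture gap" and "gap by Difference" are hypothetical names introduced here. Interval() placed between two elements of a Sequence reports the time elapsed between them; Difference() is shown as an alternative that refers to the two boundaries by name.

```oxcal
Plot()
{
 Curve("ShCal13","ShCal13.14c");
 Sequence("model 2 with gap")
 {
  Boundary("start 1");
  Phase("IBP_clearance")
  {
   R_Date("NZA60321", 360, 22);
   R_Date("NZA60799", 387, 19);
   R_Date("NZA60802", 381, 19);
  };
  Boundary("end 1");
  Boundary("start 2");
  Phase("F46_midden")
  {
   R_Date("NZA60221", 316, 21);
   R_Date("NZA60800", 279, 20);
  };
  Boundary("end 2");
  // Time elapsed between "end 2" and "start 3" (hypothetical label)
  Interval("horticulture gap");
  Boundary("start 3");
  Phase("midden I & J")
  {
   R_Date("Wk-45431", 335, 16);
   R_Date("Wk-45432", 298, 15);
   R_Date("Wk-45433", 256, 17);
   R_Date("Wk-45434", 290, 15);
   R_Date("Wk-45435", 298, 16);
  };
  Boundary("end 4");
 };
 // Alternative: "start 3" minus "end 2", computed from the named boundaries
 Difference("gap by Difference", "start 3", "end 2");
};
```

With no dated material from the horticultural period itself, the result is constrained only by the phases on either side, so it should be read as a limit on the gap rather than a direct date for the activity.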
 
regards
 
Ray

Andrew

Aug 11, 2017, 2:42:57 AM
to OxCal
Thanks Ray,

Yes I see that. The calibration curve inversion for that period creates real problems interpreting sites in NZ! It spans such a crucial time between Golson's early 'archaic' to late 'classic' periods.

Question 1 - How can I know which 'Uniform span prior' setting, TRUE or FALSE, would provide a better representation of the archaeological context?

Question 2 - I have a related modelling question, for which I've attached CODE 1 below. Once the MCMC has run, the table data show that two dates within the model have "Warning poor agreement", but the model is still acceptable? How does that work?

Question 3 - Similarly, with CODE 2 below, I have used Combine on several dates from two separate contexts to test whether my assumption that they are contemporary is refuted. The table data show that there is again a WARNING Poor Agreement between the contexts, but the overall model passes. To what extent can I accept that my assumption holds? How can I account for the poor agreement and still accept the model?


Really hoping you may have some insight into these rather deep questions about maths interpreting archaeology...

Cheers

CODE 1 
Plot()
 {
  Curve("ShCal13","ShCal13.14c");
  Sequence("model 4")
  {
   Boundary("start 1");
   Phase("IBP_clearance")
   {
    R_Date("NZA60321", 360, 22);
    R_Date("NZA60799", 387, 19);
    R_Date("NZA60802", 381, 19);
   };
   Boundary("Transition 1/2");
   Phase("F46_midden")
   {
    R_Date("NZA60221", 316, 21);
    R_Date("NZA60800", 279, 20);
   };
   Boundary("Transition 2/3");
   Phase("midden I & J")
   {
    R_Date("Wk-45431", 335, 16);
    R_Date("Wk-45432", 298, 15);
    R_Date("Wk-45433", 256, 17);
    R_Date("Wk-45434", 290, 15);
    R_Date("Wk-45435", 298, 16);
   };
   Boundary("End 3");
  };
 };


CODE 2
 Options()
 {
  Curve="ShCal13.14c";
 };
 Plot()
 {
  Combine("test_combine_paleosol_underF46_+_Area_19-2")
  {
   R_Date("NZA60318_Area_19-2", 322, 22);
   R_Date("NZA60321_Tr30", 360, 22);
   R_Date("NZA60799_Tr30", 387, 19);
   R_Date("NZA60802_Tr30", 381, 19);
  };
 };

Rayfo...@aol.com

Aug 11, 2017, 7:06:22 AM
to ox...@googlegroups.com
Hi Andrew H,
 
First, the caveat: I don't count myself an expert, but I have the luxury of time.
 
Q1.  The Options explanation is given as
 
UniformSpanPrior (values: TRUE | FALSE; default: TRUE): whether the two extra prior factors suggested by Nicholls and Jones 2001 are used.
 
So presumably the paper referred to will give an insight as to which may be used, and when.  You can also put an expression in the boundary such as:
 
   Boundary("Start 1",Date(U(1550,1800)));
 
if you can justify it archaeologically. For example, is it realistic that the Phase 1 uncertainty would be in the range 1300 to 1650, or 1480 to 1650, given the large curve inversion?  In one instance the span of the dates is about 120 years (mean 50) and in the other about 70 years (mean 20).  You could show both and argue the case.
 
 
 
Q2.  Using the common uncertainty range of 95.4% implies that about 1 in 20 calibrations will lie outside the range by chance.  We do not have the luxury of knowing which these will be, but when a calibrated pdf lies somewhat beyond the likelihood pdf, then a warning is produced.  The complete model may still be acceptable, and if the A model index is above 60, it is deemed so.
If many individual indices produce warnings, then the A model index will probably show rejection.  Time then to examine the model.  Is there a case for outlier analysis?  Could some dates be intrusive? etc.
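
If outlier analysis does seem warranted, a minimal sketch might look like the following. It assumes the general t-type outlier model with a 5% prior outlier probability on every date; that prior, and the choice of outlier model, are assumptions made here for illustration and would need to be justified for the actual samples.

```oxcal
Plot()
{
 Curve("ShCal13","ShCal13.14c");
 // General outlier model: t-distributed shifts, scale 10^0 to 10^4 years
 Outlier_Model("General",T(5),U(0,4),"t");
 Sequence("model 4 with outliers")
 {
  Boundary("start 1");
  Phase("IBP_clearance")
  {
   // 0.05 = assumed prior probability that the date is an outlier
   R_Date("NZA60321", 360, 22) { Outlier("General", 0.05); };
   R_Date("NZA60799", 387, 19) { Outlier("General", 0.05); };
   R_Date("NZA60802", 381, 19) { Outlier("General", 0.05); };
  };
  Boundary("Transition 1/2");
  Phase("F46_midden")
  {
   R_Date("NZA60221", 316, 21) { Outlier("General", 0.05); };
   R_Date("NZA60800", 279, 20) { Outlier("General", 0.05); };
  };
  Boundary("Transition 2/3");
  Phase("midden I & J")
  {
   R_Date("Wk-45431", 335, 16) { Outlier("General", 0.05); };
   R_Date("Wk-45432", 298, 15) { Outlier("General", 0.05); };
   R_Date("Wk-45433", 256, 17) { Outlier("General", 0.05); };
   R_Date("Wk-45434", 290, 15) { Outlier("General", 0.05); };
   R_Date("Wk-45435", 298, 16) { Outlier("General", 0.05); };
  };
  Boundary("End 3");
 };
};
```

In the output, each date then carries a posterior outlier probability that can be compared with the 5% prior. Dates whose posterior is much higher are the ones to re-examine archaeologically.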
 
Q3.  I think in this case the Combine test shows that they are probably contemporary.  The test is shown as
test_combine_paleosol_underF46_+_Area_19-2 Combine()
 X2-Test: df=3 T=4.297(5% 7.815)
  68.2% probability
    1505AD ( 9.1%) 1513AD
    1546AD (52.1%) 1589AD
    1617AD ( 7.1%) 1623AD
  95.4% probability
    1501AD (82.5%) 1595AD
    1612AD (12.9%) 1628AD
 Agreement n=4 Acomb= 69.6%(An= 35.4%)
 
(click on the table then the bar icon in the second column)
If Acomb were less than An, the test would be flagged as failed.  What you have is one posterior date with an individual agreement index of 57, but the pdf of the combination passes the test.
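
As a worked check on the threshold quoted in that output, the An value follows from the number of dates combined (n = 4 here):

```latex
A_n \;=\; \frac{1}{\sqrt{2n}} \;=\; \frac{1}{\sqrt{8}} \;\approx\; 0.354 \;=\; 35.4\%
```

The reported Acomb of 69.6% is well above this, which is why the combination passes even though one individual index is just under 60.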
 
Hope this helps
 
regards
 
Ray
 

Christopher Ramsey

Aug 11, 2017, 11:52:07 AM
to ox...@googlegroups.com

> On 11 Aug 2017, at 07:42, Andrew <ajarch...@gmail.com> wrote:
>
> Thanks Ray,
>
> Yes I see that. The calibration curve inversion for that period creates real problems interpreting sites in NZ! It spans such a crucial time between Golson's early 'archaic' to late 'classic' periods.
>
> Question 1- how to know which 'Uniform span prior' setting TRUE or FALSE would provide a better representation of the archaeological context ?

The default for this setting is TRUE and unless you specifically want to avoid the normal priors you should leave it unspecified.

>
> Question 2 - I have a related modelling question for which Ive attached the CODE 1 below. Once MCMC has run the table data shows that within the model two dates have "Warning poor agreement", but the model is still acceptable? How does that work?
>

Roughly 1 in 20 dates might randomly give poor agreement. If the overall model agreement is ok then you can accept the model. In practice if the level for any one point is low enough it will pull Amodel down anyway.

> Question 3 - Similarly, with CODE 2 below, I have used Combine on several dates from two separate contexts that im testing if my assumption they are contemporary is refuted. The table data result shows that there is again WARNING Poor Agreement between the context, but the overall model passes. To what extent can I accept that my assumption holds? How can I account for the poor agreement and accept the model?

Again with the second code - the overall Acomb is ok and so the fact that one of the individual agreement indices is just below 60 is not significant.

Best wishes

Christopher

Andrew

Aug 14, 2017, 6:03:00 PM
to OxCal
Hi Ray. Thanks so much for your time. Most appreciated. 