Identifying Distinct Occupations


Dayna

Mar 2, 2015, 2:10:02 PM
to ox...@googlegroups.com
Hello,

  This is my first time using OxCal and I do not have a strong coding background, so I am struggling a little with some of the analysis functions.

  I am looking at radiocarbon dates for a large number of sites to determine the relative frequency of site revisits (that is, whether or not a pattern of persistent use exists for the region).  Some of the sites have radiocarbon dates that are very close in age.  I want to know whether these dates likely represent a single occupation or two distinct occupations.

  Is this where I would use R_Combine?  Is there a guide to writing code for this type of analysis?

  Thanks for any help!!

   Dayna

Erik

Mar 6, 2015, 7:27:50 AM
to ox...@googlegroups.com
Hi Dayna—

The spatial and stratigraphic relationships between the dates are key to choosing commands and setting up a model in OxCal.
Are the dates you want to compare from the same site or different sites? What do you mean (temporally and spatially) by a single "occupation"? Of the whole region?

You can compare two dates to see if their probability ranges overlap. Check the manual for the Order and Difference commands.
R_Combine is used when you have two dates from the same thing that should have the same date, for example, two bones from the same individual.
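
To make the R_Combine case concrete, here is a minimal Python sketch of the kind of check it relies on, as far as I understand it: the measurements are pooled as an inverse-variance weighted mean before calibration, and a chi-squared statistic tests whether they are consistent with a single true age. The function name and the two example dates are hypothetical and are not taken from this thread or from OxCal itself.

```python
import math

def pool_measurements(ages, errors):
    """Pool uncalibrated radiocarbon ages that should share one true age.

    Returns the inverse-variance weighted mean, its standard error, and
    the chi-squared statistic T measuring scatter around the pooled mean.
    """
    weights = [1.0 / e ** 2 for e in errors]
    pooled = sum(w * a for w, a in zip(weights, ages)) / sum(weights)
    pooled_error = math.sqrt(1.0 / sum(weights))
    t_stat = sum(((a - pooled) / e) ** 2 for a, e in zip(ages, errors))
    return pooled, pooled_error, t_stat

# Hypothetical example: two measurements on the same bone (years BP).
pooled, pooled_error, t_stat = pool_measurements([2450, 2500], [30, 30])
print(f"pooled age {pooled:.0f} +/- {pooled_error:.0f} BP, T = {t_stat:.2f}")
```

With two measurements there is one degree of freedom, so T is compared with the 5% critical value of 3.84. Here T is about 1.39, so these two hypothetical measurements would be treated as consistent. Passing that test says the measurements could share one age; it does not by itself show that two samples come from a single occupation.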

Erik

MILLARD A.R.

Mar 7, 2015, 6:53:09 AM
to ox...@googlegroups.com
> From: Dayna
> Sent: 02 March 2015 15:22
>
> I am looking at radiocarbon dates for a large number of sites to
> determine the relative frequency of site revisits (determine whether or
> not a pattern of persistent use exists for the region). Some of the
> sites have radiocarbon dates that are very close in age. I want to know
> if these dates likely represent a single occupation, or two distinct
> occupations.

I think this is essentially the same question as Allen W asked on 25 December, and I suspect he had no reply. (His question was whether breaks in occupation of sites could be inferred from gaps in a corpus of radiocarbon dates.) While there are methods implemented in OxCal to estimate the length of a gap between two occupations given that we know *a priori* how to group dates into occupations, there is no formal method in OxCal to determine the number of occupations and gaps. It would be possible to express this problem mathematically, but I am not aware of any published methodology. One option might be a change-point analysis on the presence/absence of occupation and how many times it switched. An alternative might be to regard this as a model-choice problem, where each model is a series of disjoint phases and we want to know the probabilities of models with different numbers of phases. The problem here is that the number of theoretically possible models rises very quickly with the number of dates: even if all the dates are known to be in order, 4 dates have 8 possible models, but 6 dates have 32. The number of models might be reduced by archaeological reasoning, for example, by requiring a minimum length for a gap, or if the question is about whether a sterile layer represents a significant period without occupation.
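
The growth in the number of candidate models can be sketched in a few lines of Python. This assumes that a "model" is any split of the ordered dates into consecutive phases, so that a break either does or does not fall between each pair of neighbouring dates; the function name is hypothetical.

```python
from itertools import product

def contiguous_phase_models(n_dates):
    """Yield every split of n ordered dates into consecutive phases.

    Each model is a list of phase numbers, one per date, e.g. [1, 1, 2, 2].
    """
    # Between each pair of neighbouring dates there is either a break or not.
    for breaks in product([False, True], repeat=n_dates - 1):
        phases = [1]
        for has_break in breaks:
            phases.append(phases[-1] + (1 if has_break else 0))
        yield phases

models = list(contiguous_phase_models(4))
for phases in models:
    print(phases)
print(len(models))  # 8
```

Under this assumption the count is 2^(n-1), so it doubles with every additional date, which is why some archaeological reasoning is needed to prune the list before comparing models.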

If you have only a few models that you want to compare then I believe it might be possible to do this formally using OxCal and a few extra calculations, though I have not seen this done, and perhaps others could comment. According to Bronk Ramsey (2009), A_model is a transformation of F_model, and F_model is an approximation to a pseudo-Bayes factor. Bayes factors can be used to compare models. F_model compares a given model to a null model with no constraints. So for all models incorporating exactly the same set of radiocarbon dates, the null model is exactly the same. Thus, using F_model = (A_model/100)^sqrt(n), where n is the number of dates, for two models A and B we have F_modelA/F_modelB = (A_modelA/A_modelB)^sqrt(n), and this can be used as an approximation to a Bayes factor between the two models. Bayes factors greater than 5 or 10 are generally reckoned to be strong evidence to prefer one model over another.
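
Written out, the step from the two F_model values to the ratio of A_model values is as follows. This is only a restatement of the argument above; BF_AB is a label introduced here for the approximate Bayes factor of model A over model B.

```latex
\begin{align*}
F_{\text{model}} &= \left(\frac{A_{\text{model}}}{100}\right)^{\sqrt{n}}, \\
\mathrm{BF}_{AB} \approx \frac{F_{\text{model},A}}{F_{\text{model},B}}
  &= \frac{\left(A_{\text{model},A}/100\right)^{\sqrt{n}}}{\left(A_{\text{model},B}/100\right)^{\sqrt{n}}}
   = \left(\frac{A_{\text{model},A}}{A_{\text{model},B}}\right)^{\sqrt{n}}.
\end{align*}
```

The factor of 100 cancels, and because both F_model values are measured against the same null model, their ratio compares A and B directly. For instance, with hypothetical values A_model,A = 100, A_model,B = 50 and n = 4, the ratio is 2^2 = 4.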

For four dates W, X, Y, Z we could set up two models and compare them to see whether one occupation, or two occupations with a gap between X and Y, is more likely:

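// Model A: a single occupation.
// W, X, Y and Z stand in for the four measurements (name, age, error).
Sequence()
{
 Boundary();   // start of the occupation
 R_Date(W);
 R_Date(X);
 R_Date(Y);
 R_Date(Z);
 Boundary();   // end of the occupation
};

// Model B: two occupations with a gap between X and Y.
Sequence()
{
 Boundary();   // start of the first occupation
 R_Date(W);
 R_Date(X);
 Boundary();   // end of the first occupation
 Boundary();   // start of the second occupation
 R_Date(Y);
 R_Date(Z);
 Boundary();   // end of the second occupation
};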

I haven't tried to implement this even on a toy example, but what do others think?

Reference
Bronk Ramsey C. 2009. Bayesian analysis of radiocarbon dates. Radiocarbon 51:337-360.


Best wishes

Andrew
--
 Dr. Andrew Millard 
e: A.R.M...@durham.ac.uk | t: +44 191 334 1147
 w: http://www.dur.ac.uk/archaeology/staff/?id=160
 Senior Lecturer in Archaeology, Durham University, UK


Ben

May 19, 2015, 7:01:10 AM
to ox...@googlegroups.com
Hi Andrew,

I'm quite interested in how this would be done in OxCal. I have a similar question: I have six calendar dates and want to know whether I should treat them as a single phase or whether there are gaps between them. I've come up with four possible models and obtained A_model scores for each. The model scheme is as follows, where 1-3 are phases, each row is a date, and each column is a model:

A, B, C, D
1, 1, 1, 1
1, 1, 1, 1
1, 1, 2, 2
1, 2, 2, 3
1, 2, 2, 3

Could I ask for your feedback on whether I have implemented these models correctly in the code below? And am I interpreting the A_model values correctly as indicating that model A is the best one in this case? (I took inspiration from this Q&A: https://groups.google.com/d/msg/oxcal/YRGxmMKWIIo/lOUbzBsP7_oJ.) I would like to compute likelihood ratios between pairs of my models to compare with some published results. I've also had a go at the pseudo-Bayes factors using your instructions above (my results are below). I hope they're on the right track. I'd be most grateful for any guidance.

thanks,

Ben

Here are the models:


 // Model A: A_model = 99

 Plot()
 {
  Sequence()
  {
   Boundary("Start 1");
   Phase("1")
   {
    C_Date("FC455", 187, 101);
    C_Date("FC486", 392, 58);
    C_Date("FC29", 822, 61);
    C_Date("FC20", 1789, 75);
    C_Date("FC460", 1916, 66);
   };
   Boundary("End 1");
  };
 };
 
 
 // Model B: A_model = 97.1
 
 Plot()
 {
  Sequence()
  {
   Boundary("Start 1");
   Phase("1")
   {
    C_Date("FC455", 187, 101);
    C_Date("FC486", 392, 58);
    C_Date("FC29", 822, 61);
   };
   Boundary("trans");
   Phase("2")
   {
    C_Date("FC20", 1789, 75);
    C_Date("FC460", 1916, 66);
   };
   Boundary("End 1");
  };
 };

 
 // Model C: A_model = 93.8
 Plot()
 {
  Sequence()
  {
   Boundary("Start 1");
   Phase("1")
   {
    C_Date("FC455", 187, 101);
    C_Date("FC486", 392, 58);
   };
   Boundary("trans");
   Phase("2")
   {
    C_Date("FC29", 822, 61);
    C_Date("FC20", 1789, 75);
    C_Date("FC460", 1916, 66);
   };
   Boundary("End 1");
  };
 };


 // Model D: A_model = 90.2
 Plot()
 {
  Sequence()
  {
   Boundary("Start 1");
   Phase("1")
   {
    C_Date("FC455", 187, 101);
    C_Date("FC486", 392, 58);
   };
   Boundary("trans1");
   Phase("2")
   {
    C_Date("FC29", 822, 61);
   };
   Boundary("trans2");
   Phase("3")
   {
    C_Date("FC20", 1789, 75);
    C_Date("FC460", 1916, 66);
   };
   Boundary("End 1");
  };
 };

Here's where I compute the Bayes Factors

##### Pseudo-Bayes factors (using R here) #####

# A_model values from OxCal
A_A <- 99
B_A <- 97.1
C_A <- 93.8
D_A <- 90.2

# n = 6 dates for all models
n <- 6

# F_model = (A_model / 100)^sqrt(n)
F_model_A <- (A_A / 100)^sqrt(n)
F_model_B <- (B_A / 100)^sqrt(n)
F_model_C <- (C_A / 100)^sqrt(n)
F_model_D <- (D_A / 100)^sqrt(n)

# F_model already includes the sqrt(n) exponent, so the ratio of two
# F_model values is the approximate Bayes factor between the two models

> round(F_model_A / F_model_B, 3)
[1] 1.049
> round(F_model_A / F_model_C, 3)
[1] 1.141
> round(F_model_A / F_model_D, 3)
[1] 1.256
> round(F_model_B / F_model_C, 3)
[1] 1.088
> round(F_model_B / F_model_D, 3)
[1] 1.198
> round(F_model_C / F_model_D, 3)
[1] 1.101

# Conclusion: A:D is the most extreme ratio, but every ratio is well below 5, so none of the models looks strongly preferred over another.

Is that basically on the right track?
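
For anyone who wants to repeat the comparison for a different number of dates, here is a minimal Python sketch of the same calculation. It applies the sqrt(n) exponent once, to the ratio of A_model values, as in Andrew's formula, and checks each ratio against the threshold of 5 he mentions. The function name is hypothetical. The value n = 6 is taken from the post, although the models as posted list five C_Date entries, so n is left as a parameter.

```python
import math
from itertools import combinations

def pseudo_bayes_factor(a_model_1, a_model_2, n_dates):
    """Approximate Bayes factor of model 1 over model 2.

    a_model_1 and a_model_2 are OxCal A_model values in percent and
    n_dates is the number of dates shared by both models.
    """
    return (a_model_1 / a_model_2) ** math.sqrt(n_dates)

# A_model values reported in the post above.
a_model = {"A": 99.0, "B": 97.1, "C": 93.8, "D": 90.2}
n_dates = 6  # as stated in the post; change this if the true count differs

factors = {
    (m1, m2): pseudo_bayes_factor(a_model[m1], a_model[m2], n_dates)
    for m1, m2 in combinations(a_model, 2)
}

for (m1, m2), bf in factors.items():
    # A ratio of 5 or more (or 1/5 or less) counts as strong evidence.
    verdict = "strong" if bf >= 5 or bf <= 1 / 5 else "not strong"
    print(f"{m1} vs {m2}: {bf:.3f} ({verdict})")
```

With these inputs the largest ratio is A vs D at about 1.26, far below 5, so on this approximation none of the four models is strongly preferred.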