Outlier Models


Rayfo...@aol.com

Jul 24, 2009, 3:09:45 PM
to ox...@googlegroups.com

Dear Christopher,

 

Can you offer some guidance here?

 

In OxCal V4.1 (interface build 46), the help file on outlier analysis notes that SSimple closely follows the model of Christen (1994), i.e.

 

Outlier_Model("SSimple",N(0,2),0,"s").

 

This is applied to the Chancay culture dates, with a fake value of 1400+/-70 added to illustrate the outlier analysis, and includes a late date of 1550 AD.  The outlier analysis shifts this fake date from a prior of 0.1 to a posterior of 0.39.  I have tried to simulate the Christen analysis as follows (SSimple model below):

 

The result is in broad agreement with Christen’s.  ‘Dum’ is the fake date and is flagged with a poor individual agreement of 34% (Amodel 72%).  The posterior probability of ‘Dum’ being an outlier shifts from 10% to 50%.
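To show how a prior outlier probability of 0.1 can rise for a date that sits far from the rest, here is a minimal Python sketch of the Bayes update behind an "s"-type model with N(0,2) shifts. It rests on strong simplifying assumptions that are not part of OxCal: the expected radiocarbon age is treated as known, no calibration is done, and the function names are hypothetical. It will therefore not reproduce the 0.39 or 50% figures quoted above; it only illustrates the direction of the shift.

```python
import math


def normal_pdf(x, sd):
    """Density of a zero-mean normal distribution with standard deviation sd."""
    return math.exp(-0.5 * (x / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def posterior_outlier_probability(z, prior=0.1, shift_sd=2.0):
    """Posterior probability that a measurement is an outlier.

    z        : (measured age - expected age) / measurement sigma
    prior    : prior outlier probability (0.1 in the models in this thread)
    shift_sd : sd of the outlier shift in units of sigma (assumed reading of N(0,2))
    """
    # Not an outlier: residual is N(0, 1) in sigma units.
    like_ok = normal_pdf(z, 1.0)
    # Outlier: an extra N(0, shift_sd^2) shift is added, so variances sum.
    like_out = normal_pdf(z, math.sqrt(1.0 + shift_sd ** 2))
    weighted_out = prior * like_out
    return weighted_out / (weighted_out + (1.0 - prior) * like_ok)


p_close = posterior_outlier_probability(0.0)  # measurement right on the expected age
p_far = posterior_outlier_probability(3.0)    # measurement three sigma away
```

In this simplified setting a measurement close to its expected value ends up with a posterior below the prior, while one several sigma away ends up above it. In the full model the expected value is itself uncertain and is sampled along with the outlier indicator.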

 

However, Christen also notes that the 13 Chancay determinations in Table 1 are all from charcoal samples (page 493).

 

If I then repeat the analysis but use the Charcoal outlier model as below (the Charcoal model):

 

This gives an Amodel of 99% and gives the fake ‘Dum’ an individual agreement of 65%, which, without the knowledge that it was a fake determination, would not suggest it was an outlier.

 

My take on this is that the 1400+/-70 fake determination is not an outlier when we consider that the other 13 determinations are from charcoal and hence could be residual, but that the 1400+/-70 would be an outlier had the other 13 determinations been non-charcoal.

 

Treating the 13 determinations simply as a non-charcoal phase, with ‘Dum’ as an outlier (question), gives ‘Dum’ a 1% probability of belonging to the phase.

 

Was the Fake determination an unfortunate illustration given the circumstances?

 

Regards Ray Kidd

 

 

 

 

The SSimple Model

 

Plot()
 {
  Outlier_Model("SSimple",N(0,2),0,"s");
  Sequence()
  {
   Boundary("Start 1");
   Phase("1")
   {
    R_Date("Dum", 1400, 70)
    {
     Outlier("SSimple", .1);
    };
    R_Date("Gd5824", 1140, 50)
    {
     Outlier("SSimple", .1);
    };
    R_Date("Gd6189", 1070, 60)
    {
     Outlier("SSimple", .1);
    };
    R_Date("Gd5310", 1000, 50)
    {
     Outlier("SSimple", .1);
    };
    R_Date("Gd5307", 970, 50)
    {
     Outlier("SSimple", .1);
    };
    R_Date("Gd5309", 910, 35)
    {
     Outlier("SSimple", .1);
    };
    R_Date("Gd6197", 900, 70)
    {
     Outlier("SSimple", .1);
    };
    R_Date("Gd5672", 830, 50)
    {
     Outlier("SSimple", .1);
    };
    R_Date("Gd6196", 810, 70)
    {
     Outlier("SSimple", .1);
    };
    R_Date("Gd5823", 670, 40)
    {
     Outlier("SSimple", .1);
    };
    R_Date("Gd2818", 520, 60)
    {
     Outlier("SSimple", .1);
    };
    R_Date("Gd5304", 460, 50)
    {
     Outlier("SSimple", .1);
    };
    R_Date("Gd3396", 430, 30)
    {
     Outlier("SSimple", .1);
    };
    R_Date("Gd5312", 390, 45)
    {
     Outlier("SSimple", .1);
    };
   };
   Boundary("End 1");
   Date("T_1550", 1550);
  };
 };

 

 

The Charcoal model

 

Plot()
 {
  Outlier_Model("Charcoal",Exp(1,-10,0),U(0,3),"t");
  Sequence()
  {
   Boundary("Start 1");
   Phase("1")
   {
    R_Date("Dum", 1400, 70)
    {
     Outlier("Charcoal", 1);
    };
    R_Date("Gd5824", 1140, 50)
    {
     Outlier("Charcoal", 1);
    };
    R_Date("Gd6189", 1070, 60)
    {
     Outlier("Charcoal", 1);
    };
    R_Date("Gd5310", 1000, 50)
    {
     Outlier("Charcoal", 1);
    };
    R_Date("Gd5307", 970, 50)
    {
     Outlier("Charcoal", 1);
    };
    R_Date("Gd5309", 910, 35)
    {
     Outlier("Charcoal", 1);
    };
    R_Date("Gd6197", 900, 70)
    {
     Outlier("Charcoal", 1);
    };
    R_Date("Gd5672", 830, 50)
    {
     Outlier("Charcoal", 1);
    };
    R_Date("Gd6196", 810, 70)
    {
     Outlier("Charcoal", 1);
    };
    R_Date("Gd5823", 670, 40)
    {
     Outlier("Charcoal", 1);
    };
    R_Date("Gd2818", 520, 60)
    {
     Outlier("Charcoal", 1);
    };
    R_Date("Gd5304", 460, 50)
    {
     Outlier("Charcoal", 1);
    };
    R_Date("Gd3396", 430, 30)
    {
     Outlier("Charcoal", 1);
    };
    R_Date("Gd5312", 390, 45)
    {
     Outlier("Charcoal", 1);
    };
   };
   Boundary("End 1");
   Date("T1550", 1550);
  };
 };

 

 

Christopher Ramsey

Jul 27, 2009, 9:25:42 AM
to ox...@googlegroups.com
Ray

The charcoal model is very different in nature. In this model it is
assumed that all of the dates are outliers - that is they are all
wrong relative to their deposition date - since wood takes some time
to grow and therefore all charcoal dates should be older than their
context. The difference between the date of growth and deposition for
each piece of charcoal is modelled to be drawn from an exponential
distribution with an unknown time-constant (whose value is a parameter
of the model). So in this case the sample 'Dum' is just a rather
older piece of charcoal than the others.

Christopher
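As a rough illustration of the exponential-offset idea in the reply above, the following Python sketch simulates charcoal growth dates for a single deposition event. It is only a toy under stated assumptions: the deposition year, the time constant, and the function name are hypothetical values chosen for the example, and the time constant is fixed here, whereas in the Charcoal model it is an unknown parameter estimated along with everything else.

```python
import random


def simulate_growth_dates(deposition_year, time_constant, n, seed=0):
    """Return n simulated charcoal growth dates (calendar years AD).

    Each date is the deposition year minus a non-negative offset drawn from
    an exponential distribution with mean `time_constant` years, so every
    simulated date is at or before the deposition year.
    """
    rng = random.Random(seed)
    offsets = [rng.expovariate(1.0 / time_constant) for _ in range(n)]
    return [deposition_year - offset for offset in offsets]


# Hypothetical values for illustration only.
dates = simulate_growth_dates(deposition_year=1200, time_constant=100, n=1000)
oldest = min(dates)
mean_offset = sum(1200 - d for d in dates) / len(dates)
```

Most simulated dates fall within a time constant or two of the deposition year, but the long exponential tail means that an occasional much older piece is expected rather than anomalous, which is the sense in which 'Dum' is "just a rather older piece of charcoal".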


Rayfo...@aol.com

Jul 27, 2009, 12:48:09 PM
to ox...@googlegroups.com
Thanks Christopher,
 
my difficulty was that all the R_Dates are said to be from charcoal and I therefore supposed that the Charcoal model would be appropriate.
 
The Christen model does not seem to address the charcoal issue, but there again it did not set out to do so, as it was designed to be an example of outlier detection.
 
For the record, I used DetSplit on the dates.  This identified the 1400BP outlier and also (significant) splits as follows:
 
1140, 1070, 1000, 970
 
Split
 
910, 900, 830, 810
 
Split
 
670
 
Split
 
520, 460, 430, 390
 
Treating these splits as phase transitions and again using the Charcoal model as before results in a model not dissimilar to the SSimple model, except that the individual dates now have narrower posterior distributions, possibly because the transitions provide additional constraints in the model.
 
Do you consider this use of DetSplit prior to building the OxCal model a valid thing to do or are there reasons why it should not be done?
 
regards
 
Ray Kidd
 
Model: ChristenChancayCharcoal1550PHtrans.oxcal
 
Plot()
 {
  Outlier_Model("Charcoal",Exp(1,-10,0),U(0,3),"t");
  Sequence()
  {
   Boundary("Start 1");
   Phase("1")
   {
    R_Date("Gd5824", 1140, 50)
    {
     Outlier("Charcoal", 1);
    };
    R_Date("Gd6189", 1070, 60)
    {
     Outlier("Charcoal", 1);
    };
    R_Date("Gd5310", 1000, 50)
    {
     Outlier("Charcoal", 1);
    };
    R_Date("Gd5307", 970, 50)
    {
     Outlier("Charcoal", 1);
    };
   };
   Boundary("Transition 1/2");
   Phase("2")
   {
    R_Date("Gd5309", 910, 35)
    {
     Outlier("Charcoal", 1);
    };
    R_Date("Gd6197", 900, 70)
    {
     Outlier("Charcoal", 1);
    };
    R_Date("Gd5672", 830, 50)
    {
     Outlier("Charcoal", 1);
    };
    R_Date("Gd6196", 810, 70)
    {
     Outlier("Charcoal", 1);
    };
   };
   Boundary("Transition 2/3");
   Phase("3")
   {
    R_Date("Gd5823", 670, 40)
    {
     Outlier("Charcoal", 1);
    };
   };
   Boundary("Transition 3/4");
   Phase("4")
   {
    R_Date("Gd2818", 520, 60)
    {
     Outlier("Charcoal", 1);
    };
    R_Date("Gd5304", 460, 50)
    {
     Outlier("Charcoal", 1);
    };
    R_Date("Gd3396", 430, 30)
    {
     Outlier("Charcoal", 1);
    };
    R_Date("Gd5312", 390, 45)
    {
     Outlier("Charcoal", 1);
    };
   };
   Boundary("End");
   Date("T_1550", 1550);
  };
 };
 

Christopher Ramsey

Jul 31, 2009, 11:25:05 AM
to ox...@googlegroups.com
Ray

That is reasonable - I think this is appropriate. The only problem
is, if you accept a model where all dates are expected to be earlier
than the dates of their deposition, the concept of an outlier no
longer applies as they are all outliers in this sense (just by varying
degrees). The important thing with this model is that the odd really
early date won't matter at all as it will just be treated as a
terminus post quem - and it is a perfectly good date for that purpose.

There may be merits in using classical statistical methods to sort out
which dates are different from the group (as you have done with
DetSplit), but I think this should be seen as an alternative approach
to Bayesian outlier methods. I think mixing the two very different
approaches is not likely to be productive.

Christopher
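For readers unfamiliar with the classical approach mentioned above, here is a small Python sketch of one such method. DetSplit's own algorithm is not described in this thread, so the sketch substitutes the Ward and Wilson (1978) chi-squared test, which checks whether a set of uncalibrated radiocarbon ages is consistent with a single pooled mean. The function names are hypothetical, and the critical values are the standard 5% chi-squared values for 1 to 5 degrees of freedom.

```python
# Standard 5% critical values of the chi-squared distribution, keyed by
# degrees of freedom (only small groups are covered in this sketch).
CHI2_CRIT_5PCT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def ward_wilson(ages, errors):
    """Return (pooled mean, test statistic T) for radiocarbon ages in BP."""
    weights = [1.0 / e ** 2 for e in errors]
    pooled = sum(w * a for w, a in zip(weights, ages)) / sum(weights)
    t_stat = sum((a - pooled) ** 2 / e ** 2 for a, e in zip(ages, errors))
    return pooled, t_stat


def is_consistent(ages, errors):
    """True if the ages pass the test at the 5% level (n - 1 degrees of freedom)."""
    _, t_stat = ward_wilson(ages, errors)
    return t_stat <= CHI2_CRIT_5PCT[len(ages) - 1]


# First group reported in the thread: Gd5824, Gd6189, Gd5310, Gd5307.
group_ages = [1140, 1070, 1000, 970]
group_errors = [50, 60, 50, 50]
group_ok = is_consistent(group_ages, group_errors)

# The same group with the fake 1400+/-70 determination added.
with_fake_ok = is_consistent(group_ages + [1400], group_errors + [70])
```

Under this stand-in test the first group is internally consistent on its own, and adding the 1400+/-70 determination makes it fail. The test ignores calibration and stratigraphy, which is one reason it is better treated as a separate view of the data than as an input to the Bayesian model.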

Rayfo...@aol.com

Jul 31, 2009, 11:57:07 AM
to ox...@googlegroups.com
Many thanks Christopher,
 
your reply has been a great help.   I would not mix DetSplit.exe with OxCal as a matter of course, but it is a quick and useful tool to get an alternative view in some circumstances, as long as I keep in mind the limitations.
 
regards
 
Ray Kidd