LinearModelFit (and similar statistical functions) are documented with
examples such as:
data = {{0, 1}, {1, 0}, {3, 2}, {5, 4}};
lm = LinearModelFit[data, x, x]
which returns a FittedModel 'object'. One can then execute code such
as lm["FitResiduals"] or lm["RSquared"] to retrieve the properties one
desires.
1. How exactly does Mathematica generate/store the objects associated
with the head 'lm'?!? I'd *really* appreciate a simple example of how
to do this and/or some tips on when/how to use more complicated head
structures.
2. And taking that one step further, how does Mathematica generate/
store additional info, such as 'Descriptions' on specific properties?
For example, lm["ParameterErrors", "LongDescription"], returns
"standard errors for parameter estimates".
3. Does Mathematica immediately calculate and store all these
lm["objects"], or are they generated (where appropriate) only when
requested? (And is this really OOP? <-- not intended to start an OOP
flame war!)
4. (Possibly related, so I figured I'd include it...) How does
Mathematica store all the metadata available on example data sets?---
Is there documentation on how to conform to this standard/approach?
For instance: ExampleData[{"Matrix","FIDAP007"}, "Properties"]. (on
second thought, I guess this is probably stored in some standard file
format, hidden from view...)
Thanks for any info and guidance you can provide,
-RG
> 1. How exactly does Mathematica generate/store the objects associated
> with the head 'lm'?!? I'd *really* appreciate a simple example of how
> to do this and/or some tips on when/how to use more complicated head
> structures.
>
> 2. And taking that one step further, how does Mathematica generate/
> store additional info, such as 'Descriptions' on specific properties?
> For example, lm["ParameterErrors", "LongDescription"], returns
> "standard errors for parameter estimates".
The variable lm is set to an expression with head FittedModel, whose
arguments contain the data necessary to describe the fitted model. It is
the SubValues of the symbol FittedModel that contain all the definitions
for the various properties. It's not easy to read but you can look at
all these definitions with SubValues[FittedModel].
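For a more readable case than SubValues[FittedModel], here is a minimal sketch of the same mechanism. The head toyModel and its two property names are made up for illustration and are not part of Mathematica.

```mathematica
(* toyModel is a hypothetical head used only for illustration *)
ClearAll[toyModel];

(* Definitions of the form head[args][prop] are stored as SubValues of the head *)
toyModel[data_List]["Length"] := Length[data]
toyModel[data_List]["Total"] := Total[data]

(* The expression itself stays unevaluated; it only carries its arguments *)
tm = toyModel[{1, 2, 3}];

tm["Total"]            (* 6 *)
SubValues[toyModel]    (* lists the two rules defined above *)
DownValues[toyModel]   (* {} : nothing is defined for toyModel[...] itself *)
```

With only two rules, the SubValues output is short enough to read directly.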
> 3. Does Mathematica immediately calculate and store all these
> lm["objects"], or are they generated (where appropriate) only when
> requested?
From what I can see in the SubValues, it looks like most of it is
generated on request from the data that is stored in the FittedModel. My
guess is that anything that takes a while to compute is stored as data
in the FittedModel, while anything that is fast and comes in many
variants is generated on the fly.
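The split between stored and on-demand results can be sketched as follows. This is only an illustration of the idea, not a description of the FittedModel internals; toyStats and makeToyStats are hypothetical names.

```mathematica
(* toyStats and makeToyStats are hypothetical names for illustration *)
ClearAll[toyStats, makeToyStats];

(* Constructor: do the "expensive" work once and store the result in the object *)
makeToyStats[data_List] := toyStats[data, Mean[data]]

(* Stored result: returned directly from the object's arguments *)
toyStats[data_, mean_]["Mean"] := mean

(* Cheap derived results: computed only when the property is requested *)
toyStats[data_, mean_]["Deviations"] := data - mean
toyStats[data_, mean_]["SumOfSquares"] := Total[(data - mean)^2]

ts = makeToyStats[{1, 2, 3, 4}];
ts["Mean"]           (* 5/2 *)
ts["Deviations"]     (* {-3/2, -1/2, 1/2, 3/2} *)
ts["SumOfSquares"]   (* 5 *)
```

Here the mean plays the role of the costly fit: it is computed once by the constructor, and every other property is derived from the stored arguments when asked for.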
> (And is this really OOP? <-- not intended to start an OOP
> flame war!)
I would say this makes use of many aspects that you also find in OOP.
> 4. (Possibly related, so I figured I'd include it...) How does
> Mathematica store all the metadata available on example data sets?---
> Is there documentation on how to conform to this standard/approach?
> For instance: ExampleData[{"Matrix","FIDAP007"}, "Properties"]. (on
> second thought, I guess this is probably stored in some standard file
> format, hidden from view...)
I have no insight into how the Data functions are implemented, but they
typically fetch their data as WDX files from the Wolfram servers and
cache them in local directories, which you can list with something like:
FileNames["*",
FileNameJoin[{$UserBaseDirectory, "Paclets", "Repository"}]]
You could try importing these files to see what is in them. Again, that
is not an easy read, but I think one could reverse engineer how they
work from the content.
hth,
albert
Albert is correct on all counts: the definitions are attached to a
FittedModel pattern, lm is a FittedModel expression, and FittedModel
objects store enough information for the internal code to compute
results without redoing the fit. This is a type of object-oriented
programming. I just wanted to follow up with a short example that
demonstrates how this type of behavior can be accomplished.
For demonstration purposes, myWrapper will be the head for our object
(playing an analogous role to that of FittedModel) and makeWrapper will
be the constructor function (analogous to *ModelFit functions for
FittedModel) which makes myWrapper objects. The object will only contain
a list of values and a variable.
These define short descriptions for three properties:
In[1]:= myWrapper /:
myWrapper[_List, _Symbol]["Values", "Description"] := "list of values"
In[2]:= myWrapper /:
myWrapper[_List, _Symbol]["Variable", "Description"] := "variable"
In[3]:= myWrapper /:
myWrapper[_List, _Symbol]["Product",
"Description"] := "product of values and variable"
These define the properties themselves:
In[4]:= myWrapper /: myWrapper[vals_List, var_Symbol]["Values"] := vals
In[5]:= myWrapper /:
myWrapper[vals_List, var_Symbol]["Variable"] := var
In[6]:= myWrapper /: myWrapper[vals_List, var_Symbol]["Product"] :=
vals*var
This defines the constructor function, which just takes its first two
arguments and puts them in myWrapper:
In[7]:= makeWrapper[data_List, var_Symbol, opts___] :=
myWrapper[data, var]
Now from a constructed myWrapper object, we can get the defined
properties and descriptions:
In[8]:= res = makeWrapper[Range[10], x, argThatWillBeIgnored]
Out[8]= myWrapper[{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}, x]
In[9]:= res["Values"]
Out[9]= {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
In[10]:= res["Variable"]
Out[10]= x
In[11]:= res["Product"]
Out[11]= {x, 2 x, 3 x, 4 x, 5 x, 6 x, 7 x, 8 x, 9 x, 10 x}
In[12]:= Table[
res[i, "Description"], {i, {"Values", "Variable", "Product"}}]
Out[12]= {"list of values", "variable", "product of values and \
variable"}
What the FittedModel code does is more complicated because it has a lot
more to deal with, but this example shows the basic ideas behind how it
works.
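As one possible extension of the same idea, the sketch below adds a "Properties" listing, a rule for requesting several properties at once, and a fallback for unknown names. These three rules are additions for illustration and are not taken from the FittedModel code. The block restates the property definitions so that it runs on its own.

```mathematica
(* Self-contained variant of the myWrapper example; "Properties", the list
   rule and the fallback rule are illustrative additions *)
ClearAll[myWrapper];

myWrapper[vals_List, var_Symbol]["Values"] := vals
myWrapper[vals_List, var_Symbol]["Variable"] := var
myWrapper[vals_List, var_Symbol]["Product"] := vals*var

(* A property that lists the names of the other properties *)
myWrapper[_List, _Symbol]["Properties"] := {"Values", "Variable", "Product"}

(* Several properties at once: map the request over the list *)
myWrapper[vals_List, var_Symbol][props_List] :=
  Map[myWrapper[vals, var][#] &, props]

(* Fallback for unknown property names; the specific rules above are tried first *)
myWrapper[_List, _Symbol][other_String] := Missing["UnknownProperty", other]

res = myWrapper[{1, 2, 3}, x];
res["Properties"]              (* {"Values", "Variable", "Product"} *)
res[{"Values", "Product"}]     (* {{1, 2, 3}, {x, 2 x, 3 x}} *)
res["NoSuchProperty"]          (* Missing["UnknownProperty", "NoSuchProperty"] *)
```

This assumes x has no value assigned, since the var_Symbol pattern only matches an unevaluated symbol.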
Darren Glosemeyer
Wolfram Research
Application of FullForm shows that lm is actually:
FittedModel[
 List["Linear", List[0.18644067796610198`, 0.6949152542372878`],
  List[List[x], List[1, x]], List[0, 0]],
 List[List[1.`, 1.`, 1.`, 1.`]],
 List[List[0, 1], List[1, 0], List[3, 2], List[5, 4]],
 List[List[1.`, 0.`], List[1.`, 1.`], List[1.`, 3.`], List[1.`, 5.`]],
 Function[Null, Internal`LocalizedBlock[List[x], Slot[1]],
  List[HoldAll]]]
Similar output is obtained for the other fit functions. You can see
that the data, the model, the fit and the type of fit are stored as
FittedModel parameters and that's about all there is.
The special formatting of the FittedModel output is probably done by
means of a Format definition, as in:
myModel /: Format[myModel[a_, b_]] := myModel[Panel[a[b]]]
myModel[tata, titi]
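A slightly fuller sketch of the same idea, using a made-up head toyBox: the Format definition changes only how the expression is displayed, while the expression itself is left untouched. How FittedModel really formats its output is not confirmed here.

```mathematica
(* toyBox is a hypothetical head used only for illustration *)
ClearAll[toyBox];

(* Display rule: show a short summary panel instead of the raw arguments *)
Format[toyBox[data_List]] :=
  Panel[Row[{"toyBox with ", Length[data], " values"}]]

tb = toyBox[{1, 2, 3}];

tb                          (* displayed as a panel with a summary line *)
FullForm[tb]                (* toyBox[List[1, 2, 3]] *)
tb === toyBox[{1, 2, 3}]    (* True: the stored expression is unchanged *)
```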
Properties could be defined along the following lines (schematically,
with a_, b_, c_, d_ standing in for the stored arguments):
FittedModel[a_, b_, c_, d_]["FitResiduals"] := FittedModelResiduals[a,
b, c, d]
FittedModel[a_, b_, c_, d_]["BestFit"] := FittedModelBestFit[a, b, c,
d]
FittedModel[a_, b_, c_, d_]["ParameterErrors","LongDescription"] :=
"standard errors for parameter estimates"
etc. etc. Of course, you then have to define FittedModelResiduals,
FittedModelBestFit to do something useful with the input.
Alternatively, the functions could also be defined within FittedModel
as
FittedModel[a_, b_, c_, d_]["FitResiduals"] :=
FittedModel["FitResiduals", a, b, c, d]
FittedModel[a_, b_, c_, d_]["BestFit"] :=
FittedModel["BestFit", a, b, c, d]
or something similar.
If you change the x values in the design matrix of the FullForm version
of lm into variables and ask for the fit residuals, you'll see it's just
calculating the residuals from the data present in the FittedModel
parameters:
In[90]:= FittedModel[
 List["Linear", List[0.18644067796610198`, 0.6949152542372878`],
  List[List[x], List[1, x]], List[0, 0]],
 List[List[1.`, 1.`, 1.`, 1.`]],
 List[List[0, 1], List[1, 0], List[3, 2], List[5, 4]],
 List[List[1.`, a], List[1.`, b], List[1.`, c], List[1.`, d]],
 Function[Null, Internal`LocalizedBlock[List[x], Slot[1]],
  List[HoldAll]]]["FitResiduals"]
Out[90]= {0.813559 - 0.694915 a, -0.186441 - 0.694915 b,
 1.81356 - 0.694915 c, 3.81356 - 0.694915 d}
You see? It's actually quite simple. FittedModel[a_, b_, c_, d_]
doesn't do anything; it just sits there holding its parameters. There
are only definitions for FittedModel[a_, b_, c_, d_][...] type of
inputs.
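To make this concrete, here is a small sketch of an object that behaves the same way for a straight-line fit. The names toyFit and makeToyFit are hypothetical, and this is not how FittedModel is implemented internally. It only imitates the pattern of storing the coefficients, data and design matrix, and computing the residuals from them on request.

```mathematica
(* toyFit and makeToyFit are hypothetical names for illustration *)
ClearAll[toyFit, makeToyFit];

(* Constructor: build the design matrix {1, x}, solve the least-squares
   problem once, and store coefficients, data and design matrix *)
makeToyFit[data : {{_, _} ..}] :=
  Module[{dm, coeffs},
    dm = {1., #} & /@ data[[All, 1]];
    coeffs = LeastSquares[dm, data[[All, 2]]];
    toyFit[coeffs, data, dm]]

(* Stored value, returned as is *)
toyFit[coeffs_, data_, dm_]["BestFitParameters"] := coeffs

(* Computed on request from the stored arguments: y - dm.coeffs *)
toyFit[coeffs_, data_, dm_]["FitResiduals"] := data[[All, 2]] - dm . coeffs

tf = makeToyFit[{{0, 1}, {1, 0}, {3, 2}, {5, 4}}];
tf["BestFitParameters"]   (* approximately {0.186441, 0.694915} *)
tf["FitResiduals"]        (* approximately {0.813559, -0.881356, -0.271186, 0.338983} *)
```

The coefficients agree with those visible in the FullForm of lm. As with FittedModel, toyFit[...] on its own does nothing: only toyFit[...]["..."] has definitions.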
Cheers -- Sjoerd