Download OCCAM - Systems Science Graduate Program

Occam User’s Manual
5/12/2012
ties or filling in missing data. If it too fails to specify a prediction, Occam will fall back to
the independence model.
VII. Fit Output
After echoing the input parameters (which are requested by default), Occam prints out
some properties of the model, followed by some measures for the model. The measures are
given twice: first with the top of the lattice as the reference model, and then with the
bottom.
Output file for a directed system
Below is a sample output for the same example data used in the Search chapters. The
model being fit is the top model, “ABC”, where A and B are IVs, and C is the DV. The
first columns show all of the “IV” state combinations that appear in the data. The next
three columns, marked “Data”, show the frequencies in the data for each of those IV
states, along with the observed conditional probabilities for the DV states. The following
columns show the calculated conditional probabilities for the model, along with the
selected prediction rule. The last columns show the performance of those rules on the
data.
IV        Data                        Model
A   B     freq    obs. p(DV|IV)       calc. q(DV|IV)      rule   #correct   %correct
                  C=0      C=1        C=0      C=1
0   0      396    36.111   63.889     36.111   63.889     1           253     63.889
0   1      259    29.730   70.270     29.730   70.270     1           182     70.270
1   0      638    35.580   64.420     35.580   64.420     1           411     64.420
1   1      185    24.865   75.135     24.865   75.135     1           139     75.135
          1478    33.356   66.644     33.356   66.644     1           985     66.644
          freq    C=0      C=1        C=0      C=1        rule   #correct   %correct
At the bottom of the table, Occam prints out a summary row including the marginal
frequencies of the DV states, also expressed as percentages. Under the “rule” column for
the Model, the summary row includes the default rule for the data. This default rule is
based on the most common DV value. (In cases of ties, the tie is broken by alphanumeric
order. For example: if a DV has two states “0” and “1” that appear with equal frequency,
the default rule would be “0”.)
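
The per-row rule, the #correct and %correct columns, and the default rule can be reproduced from the sample table with a short Python sketch. This is an illustration only, not Occam's own code. The cell counts are derived from the table's freq and #correct columns, the function names are hypothetical, and the handling of ties or missing data within a single row is deliberately left out.

```python
# Counts of each DV state (C) per IV state (A, B), derived from the sample
# table: #correct is the C=1 count (every row uses rule 1), and
# freq - #correct is the C=0 count.
counts = {
    ("0", "0"): {"0": 143, "1": 253},
    ("0", "1"): {"0": 77, "1": 182},
    ("1", "0"): {"0": 227, "1": 411},
    ("1", "1"): {"0": 46, "1": 139},
}


def fit_rows(counts):
    """Return (iv_state, freq, rule, n_correct, pct_correct) for each row.

    Assumes no row has tied DV states; Occam's own handling of ties and
    missing data is not modeled here.
    """
    rows = []
    for iv_state, dv_counts in sorted(counts.items()):
        freq = sum(dv_counts.values())
        # Predict the DV state with the highest conditional probability.
        rule = max(dv_counts, key=dv_counts.get)
        n_correct = dv_counts[rule]
        rows.append((iv_state, freq, rule, n_correct, 100.0 * n_correct / freq))
    return rows


def default_rule(counts):
    """Most common DV state overall; ties are broken by alphanumeric order."""
    totals = {}
    for dv_counts in counts.values():
        for state, n in dv_counts.items():
            totals[state] = totals.get(state, 0) + n
    # Sort key: highest count first, then smallest state label.
    rule = min(totals, key=lambda s: (-totals[s], s))
    return rule, totals


rows = fit_rows(counts)
rule, totals = default_rule(counts)
for iv_state, freq, row_rule, n_correct, pct in rows:
    print(iv_state, freq, row_rule, n_correct, round(pct, 3))
print("default rule:", rule, totals)
```

Sorting on the pair (negative count, state label) is one simple way to express "most common, then alphanumeric order" in a single pass.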
If the input file also contains test data, additional columns appear to the right,
showing how the model’s rules perform on the test data. Below the table, Occam also
outputs a brief summary of the model’s test performance. This summary compares the model
to the default rule and to the “best possible” rule set. A percent improvement is given,
showing where the model’s performance falls on a scale from the default outcome to the
best possible outcome.
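
One plausible reading of this scaling is a linear rescaling, sketched below in Python. The exact formula Occam uses is not given here, so the function, its name, and the input values are assumptions for illustration only.

```python
def percent_improvement(model_pct, default_pct, best_pct):
    """Place the model's %correct on a 0-100 scale between default and best.

    Assumed formula: 100 * (model - default) / (best - default).
    Returns None when default and best coincide, since the scale is undefined.
    """
    span = best_pct - default_pct
    if span == 0:
        return None
    return 100.0 * (model_pct - default_pct) / span


# Hypothetical test-set results (not taken from the manual).
improvement = percent_improvement(model_pct=70.0, default_pct=60.0, best_pct=80.0)
print(improvement)
```

Under this reading, a model that does no better than the default rule scores 0, and one that matches the best possible rule set scores 100.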
Output file for a neutral system
For neutral systems, Occam prints out the observed and calculated probability for every
cell, and the difference between the two (the residual). It also prints out the observed and
calculated frequencies for convenience. Below is an example table, using the same
sample data as above, with the variable C set to be an IV. The model being fit is “A:BC”.
The first column is the observed states of the IVs. The next columns are Observed and
Calculated probabilities and frequencies for each state, and then the Residuals.
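
The observed, calculated, and residual values for the model “A:BC” can be sketched in Python as follows. The cell frequencies are derived from the directed-system table above. The calculated probability uses the product p(A)·p(B,C), which is the closed form for a model made of the two non-overlapping components A and BC. The sign convention for the residual (observed minus calculated) is an assumption, as are all variable names.

```python
# Frequencies for each (A, B, C) cell, derived from the directed-system table.
freq = {
    ("0", "0", "0"): 143, ("0", "0", "1"): 253,
    ("0", "1", "0"): 77,  ("0", "1", "1"): 182,
    ("1", "0", "0"): 227, ("1", "0", "1"): 411,
    ("1", "1", "0"): 46,  ("1", "1", "1"): 139,
}
n = sum(freq.values())

# Observed probability of each cell.
obs = {cell: f / n for cell, f in freq.items()}

# Marginal distributions for the two model components, A and BC.
p_a, p_bc = {}, {}
for (a, b, c), p in obs.items():
    p_a[a] = p_a.get(a, 0.0) + p
    p_bc[(b, c)] = p_bc.get((b, c), 0.0) + p

# Calculated probability under A:BC, and the residual for each cell
# (assumed here to be observed minus calculated).
calc = {(a, b, c): p_a[a] * p_bc[(b, c)] for (a, b, c) in obs}
residual = {cell: obs[cell] - calc[cell] for cell in obs}

for cell in sorted(obs):
    print(cell, round(obs[cell], 5), round(calc[cell], 5),
          round(residual[cell], 5), freq[cell], round(calc[cell] * n, 3))
```

Because the calculated distribution preserves the A and BC marginals, the residuals sum to zero across all cells.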