year  subject_count
   1          42736
   2          27146
   3          23328
   4          19795
   5          16898
   6          14061
   7          11530
   8           9155
   9           7141
  10           5260
  11           3770
  12           2524
  13           1727
  14           1140
  15            771
  16            582
  17            374
  18            166
  19             57
  20             11
  21              6
  22              5
  23              4
  24              2
  25              5
  26              1
Autoregressive function by writing my own dist_func?

         artist_name  year  closeness_event  kcore_event  betweenness_event  \
0         colin dale  1993         0.575553    -0.848016          -0.078891
1         jeff mills  1993         0.581378    -0.849989          -0.070038
2       paul van dyk  1993         0.571694    -0.856506          -0.162971
3      robert armani  1996        -3.607296    -0.845350          65.450321
4  claudio coccoluto  1998        -3.319848    -0.865555          -0.127938

   clustering_coeff_event  career_age  release_count  travel_dist  \
0                1.183177           1      -0.569684    -0.383515
1                1.176884           4       0.072075    -0.379455
2                1.188049           5      -0.304109    -0.385076
3                1.172376           6      -0.568790    -0.382209
4               -2.937427           7       0.085921    -0.381167

   past_success  decade
0     -0.339579     1.0
1      6.717467     1.0
2      6.695121     1.0
3     -0.353896     1.0
4     -0.026755     2.0
formula = ("travel_dist ~ C(decade, Treatment(reference=1)) + closeness_event"
           " + kcore_event + betweenness_event + clustering_coeff_event"
           " + career_age + release_count + past_success")

# Cluster by artist; the groups column is passed once, positionally
# (passing it again as groups=... would be a duplicate argument).
mod = GEE.from_formula(formula, "artist_name", df,
                       family=Gaussian(),
                       time="career_age",
                       cov_struct=Exchangeable(),
                       missing="drop")
result = mod.fit()
print(result.summary())

                              GEE Regression Results
===================================================================================
Dep. Variable:               travel_dist   No. Observations:                 169101
Model:                               GEE   No. clusters:                      21020
Method:                      Generalized   Min. cluster size:                     5
                    Estimating Equations   Max. cluster size:                    19
Family:                         Gaussian   Mean cluster size:                   8.0
Dependence structure:       Exchangeable   Num. iterations:                       3
Date:                   Wed, 24 Jul 2019   Scale:                             0.073
Covariance type:                  robust   Time:                           16:27:48
============================================================================================================
                                               coef    std err          z      P>|z|      [0.025      0.975]
------------------------------------------------------------------------------------------------------------
Intercept                                    0.3460      0.108      3.214      0.001       0.135       0.557
C(decade, Treatment(reference=1))[T.2.0]    -0.1880      0.108     -1.741      0.082      -0.400       0.024
C(decade, Treatment(reference=1))[T.3.0]    -0.2938      0.108     -2.725      0.006      -0.505      -0.082
C(decade, Treatment(reference=1))[T.4.0]    -0.3270      0.108     -3.035      0.002      -0.538      -0.116
C(decade, Treatment(reference=1))[T.5.0]    -0.3488      0.108     -3.237      0.001      -0.560      -0.138
closeness_event                              0.0002      0.001      0.295      0.768      -0.001       0.002
kcore_event                                 -0.0006      0.001     -0.672      0.502      -0.002       0.001
betweenness_event                           -0.0033      0.001     -3.980      0.000      -0.005      -0.002
clustering_coeff_event                      -0.0015      0.001     -2.178      0.029      -0.003      -0.000
career_age                                  -0.0061      0.000    -13.936      0.000      -0.007      -0.005
release_count                                0.0249      0.002     11.276      0.000       0.021       0.029
past_success                                 1.0261      0.003    331.533      0.000       1.020       1.032
==============================================================================
Skew:                         -2.4159   Kurtosis:                      39.7054
Centered skew:                -0.6141   Centered kurtosis:             34.5069
==============================================================================
result.cov_struct.dep_params
0.3469744871809727
fig = result.plot_isotropic_dependence()
plt.grid(True)
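
To make the contrast between the two working correlation structures concrete, here is a minimal pure-Python sketch. It assumes, as the statsmodels documentation describes for its Autoregressive structure, that the within-cluster correlation is dep_params ** distance, with the distance computed by dist_func from the time values. The helper names and the career_age values below are hypothetical, and 0.347 is reused only for illustration; a fitted autoregressive parameter would generally differ from the exchangeable one.

```python
# Illustrative only: hypothetical helpers, no statsmodels calls.

def abs_time_distance(t_i, t_j):
    """Candidate dist_func: absolute gap between two time values."""
    return abs(t_i - t_j)


def exchangeable_corr(n, rho):
    """Exchangeable structure: every pair in a cluster shares the same rho."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]


def autoregressive_corr(times, rho, dist_func=abs_time_distance):
    """Autoregressive structure: correlation decays as rho ** distance."""
    return [[rho ** dist_func(t_i, t_j) for t_j in times] for t_i in times]


rho = 0.347                    # rounded from the dep_params value above (assumption)
career_ages = [1, 4, 5, 6, 7]  # hypothetical, unevenly spaced time points

exch = exchangeable_corr(len(career_ages), rho)
ar = autoregressive_corr(career_ages, rho)

for label, matrix in (("exchangeable", exch), ("autoregressive", ar)):
    print(label)
    for row in matrix:
        print("  " + "  ".join(f"{value:.4f}" for value in row))
```

Here `ar[0][1]` equals 0.347 ** 3 because the first two time points are three years apart, whereas `exch[0][1]` stays at 0.347 regardless of the gap. A custom dist_func would only change how that gap is measured.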

I am very confused and would appreciate any tips!
Cheers,
Mohsen