M Estimators Spss

Blanchefle Strycker

Aug 4, 2024, 8:13:12 PM

to thermemame

In statistics, M-estimators are a broad class of extremum estimators for which the objective function is a sample average.[1] Both non-linear least squares and maximum likelihood estimation are special cases of M-estimators. The definition of M-estimators was motivated by robust statistics, which contributed new types of M-estimators. However, M-estimators are not inherently robust, as is clear from the fact that they include maximum likelihood estimators, which are in general not robust. The statistical procedure of evaluating an M-estimator on a data set is called M-estimation.

More generally, an M-estimator may be defined to be a zero of an estimating function.[2][3][4][5][6][7] This estimating function is often the derivative of another statistical function. For example, a maximum-likelihood estimate is the point where the derivative of the likelihood function with respect to the parameter is zero; thus, a maximum-likelihood estimator is a zero of the score function (equivalently, a critical point of the likelihood).[8] In many applications, such M-estimators can be thought of as estimating characteristics of the population.

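Another popular M-estimator is the maximum-likelihood estimator. For a family of probability density functions $f$ parameterized by $\theta$, a maximum-likelihood estimator of $\theta$ is computed for each set of data by maximizing the likelihood function over the parameter space $\Theta$. When the observations $x_1, \dots, x_n$ are independent and identically distributed, an ML estimate $\hat{\theta}$ satisfies

$$\hat{\theta} = \arg\max_{\theta} \prod_{i=1}^{n} f(x_i, \theta)$$

or, equivalently,

$$\hat{\theta} = \arg\min_{\theta} \left( -\sum_{i=1}^{n} \log f(x_i, \theta) \right).$$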

Maximum-likelihood estimators have optimal properties in the limit of infinitely many observations under rather general conditions, but may be biased and not the most efficient estimators for finite samples.
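To illustrate maximum likelihood as an M-estimation problem, here is a minimal sketch that minimizes the sample average of the negative log-density numerically. The exponential model, the toy data, and the helper names (`avg_neg_log_lik`, `golden_section_min`) are assumptions chosen for illustration; they do not come from the text above. For this model the answer is known in closed form (one over the sample mean), which gives a check on the numerical result.

```python
import math

def avg_neg_log_lik(rate, xs):
    """Sample average of -log f(x; rate) for the exponential density
    f(x; rate) = rate * exp(-rate * x). This is the M-estimation objective."""
    return sum(rate * x - math.log(rate) for x in xs) / len(xs)

def golden_section_min(f, lo, hi, tol=1e-9):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    c = b - inv_phi * (b - a)
    d = a + inv_phi * (b - a)
    while b - a > tol:
        if f(c) < f(d):
            b = d  # the minimum lies in [a, d]
        else:
            a = c  # the minimum lies in [c, b]
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
    return (a + b) / 2

# Toy data (hypothetical), used only to exercise the functions.
xs = [0.5, 1.2, 0.3, 2.0, 0.9, 1.1]
rate_hat = golden_section_min(lambda r: avg_neg_log_lik(r, xs), 1e-6, 100.0)
closed_form = len(xs) / sum(xs)  # exponential MLE: 1 / sample mean
```

The objective is convex in the rate, so a one-dimensional bracketing search is enough here; multi-parameter models would need a general-purpose optimizer.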

More generally, the solutions $\hat{\theta} = \arg\min_{\theta} \sum_{i=1}^{n} \rho(x_i, \theta)$, where ρ is a suitably chosen function, are called M-estimators ("M" for "maximum likelihood-type" (Huber, 1981, page 43)); other types of robust estimators include L-estimators, R-estimators and S-estimators. Maximum likelihood estimators (MLE) are thus a special case of M-estimators, obtained by taking ρ to be the negative log-density. With suitable rescaling, M-estimators are special cases of extremum estimators (in which more general functions of the observations can be used).

The function ρ, or its derivative, ψ, can be chosen to give the estimator desirable properties (in terms of bias and efficiency) when the data are truly from the assumed distribution, and 'not bad' behaviour when the data are generated from a model that is, in some sense, close to the assumed distribution.

This minimization can always be done directly. Often it is simpler to differentiate with respect to θ and solve for the root of the derivative. When this differentiation is possible, the M-estimator is said to be of ψ-type. Otherwise, the M-estimator is said to be of ρ-type.

For some choices of ψ, specifically, redescending functions, the solution may not be unique. The issue is particularly relevant in multivariate and regression problems. Thus, some care is needed to ensure that good starting points are chosen. Robust starting points, such as the median as an estimate of location and the median absolute deviation as a univariate estimate of scale, are common.
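As a concrete sketch of a ψ-type M-estimator with robust starting values, the code below estimates a location parameter with Huber's ψ, starting from the median and using the (rescaled) median absolute deviation as a fixed scale. The iteratively reweighted averaging scheme, the tuning constant `k=1.345`, the function names, and the toy data are assumptions made for illustration; none of them come from the text above. Huber's ψ is monotone rather than redescending, so the solution here is unique; a redescending ψ would make the choice of starting point matter much more.

```python
import statistics

def mad_scale(xs):
    """Median absolute deviation, rescaled by 1.4826 so that it is
    consistent for the standard deviation under a normal model."""
    med = statistics.median(xs)
    return 1.4826 * statistics.median([abs(x - med) for x in xs])

def huber_location(xs, k=1.345, tol=1e-10, max_iter=100):
    """Solve sum(psi((x - mu) / s)) = 0 for mu with Huber's psi,
    using iteratively reweighted averages and a fixed scale s."""
    mu = statistics.median(xs)   # robust starting point
    s = mad_scale(xs)            # robust scale, held fixed
    if s == 0:
        return mu
    for _ in range(max_iter):
        weights = []
        for x in xs:
            r = abs(x - mu) / s
            # weight = psi(r) / r: 1 inside [-k, k], k / |r| outside
            weights.append(1.0 if r <= k else k / r)
        new_mu = sum(w * x for w, x in zip(weights, xs)) / sum(weights)
        if abs(new_mu - mu) < tol:
            return new_mu
        mu = new_mu
    return mu

# Toy data (hypothetical) with one gross outlier.
data = [9.8, 10.1, 10.0, 9.9, 10.2, 50.0]
estimate = huber_location(data)
plain_mean = statistics.mean(data)
```

On these data the sample mean is pulled far toward the outlier, while the Huber estimate stays close to the bulk of the observations.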

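When the parameter vector splits into two parts, β and γ, it may be possible to solve the first-order condition for γ as a function of β and the data W, γ = g(W, β), where g is some function to be found. We can then rewrite the original objective function solely in terms of β by inserting the function g in place of $\gamma$. As a result, there is a reduction in the number of parameters.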

Whether this procedure can be carried out depends on the problem at hand. However, when it is possible, concentrating parameters can facilitate computation to a great degree. For example, in estimating a SUR model of 6 equations with 5 explanatory variables in each equation by maximum likelihood, the number of parameters declines from 51 (30 regression coefficients plus the 21 distinct elements of the 6 × 6 error covariance matrix) to 30.[9]

Despite its computational appeal, concentrating parameters is of limited use in deriving asymptotic properties of the M-estimator.[10] The presence of W in each summand of the objective function makes it difficult to apply the law of large numbers and the central limit theorem.

It can be shown that M-estimators are asymptotically normally distributed. As such, Wald-type approaches to constructing confidence intervals and hypothesis tests can be used. However, since the theory is asymptotic, it will frequently be sensible to check the distribution, perhaps by examining the permutation or bootstrap distribution.
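One way to check the distribution is a nonparametric bootstrap, sketched below under stated assumptions. The sample median is used as the M-estimator (it minimizes the sum of absolute deviations), and the function names, the number of resamples, the seed, and the simulated data are all hypothetical choices for illustration. The percentile interval shown is only one of several bootstrap interval constructions.

```python
import random
import statistics

def bootstrap_estimates(xs, estimator, n_boot=2000, seed=0):
    """Re-evaluate the estimator on n_boot resamples drawn with
    replacement from xs; return the estimates in sorted order."""
    rng = random.Random(seed)
    n = len(xs)
    return sorted(estimator(rng.choices(xs, k=n)) for _ in range(n_boot))

def percentile_interval(sorted_estimates, level=0.95):
    """Percentile bootstrap interval read off the sorted estimates."""
    alpha = (1 - level) / 2
    m = len(sorted_estimates)
    lower = sorted_estimates[int(alpha * m)]
    upper = sorted_estimates[int((1 - alpha) * m) - 1]
    return lower, upper

# Simulated data (hypothetical), used only to exercise the functions.
data_rng = random.Random(1)
sample = [data_rng.gauss(0.0, 1.0) for _ in range(50)]

boot = bootstrap_estimates(sample, statistics.median)
lower, upper = percentile_interval(boot)
```

Comparing this interval, or a histogram of `boot`, with the Wald interval gives a rough indication of whether the normal approximation is adequate at the sample size in hand.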

After studying Stata for about half a year, my department asked me to tell them some more about Stata. One of the things my colleagues are interested in is what they can do with Stata that they can't do with SPSS. Since I am not very familiar with SPSS, I hope to find an answer on the list. Of course I already know about the great programming possibilities, but I hope to find some answers about not-too-exotic statistical methods.

I have both Stata and SPSS on my computer. In my opinion, SPSS has only two slight advantages and many, many disadvantages. The first advantage is that it is slightly more user friendly in making complex tables and graphs. But thanks to people like Nick Cox, that difference is decreasing daily. Second, SPSS has a nice routine in its logistic regression model for testing interactions. That is a trivial advantage, however. I have heard that the ANOVA commands in SPSS are also user friendly. I don't use them, however.

The only reason that I keep SPSS on my machine is that I am not pressed for disk space. I rarely use it, whereas I use Stata almost every day. Ever try to run a probit in SPSS? Nearly impossible and the documentation stinks. On the other hand, it is a breeze in Stata.

I don't know if it is a big difference or not, since I don't use SPSS all that much, but Stata has the best support system I have ever seen in any software product. Not only the Stata staff, but many Stata users respond to the most basic, and complex, questions presented. This is a fantastic advantage to anyone who uses the product.

The bottom line is that SPSS doesn't do much, although it is (perhaps too) easy to use. For example, its useful multivariate analysis procedures are pretty much limited to OLS, probit, and logit, with a few less useful additional procedures available. SPSS does not have the multiple pooled cross-sectional time series routines that Stata has. There are no count procedures (Poisson, negative binomial and the zero routines), and other maximum likelihood estimators such as Tobit, multinomial logit, ordinal logit or probit, and complementary log-log models are not readily available.

Additional problems with SPSS include no Huber-White correction for heteroskedasticity, and none of Stata's extensive tests that are available after estimation. The ANOVA routines in SPSS are not nearly as comprehensive as those in Stata. The last time I looked at SPSS there weren't any provisions for Cox regression and the other extensive duration analysis procedures that Stata offers. In short, anyone who limits themselves to SPSS would be quite handicapped.

One of the things you can do with Stata that you can't do with SPSS is estimate models for complex surveys. Most SPSS procedures will allow weights, but although these will produce correct estimates, the standard errors will be too small (aweights or iweights versus pweights). SPSS cannot take clustering into account at all. This is an important issue: most surveys use a weight variable to take stratification and/or sampling bias (random or due to non-response) into account, but standard programs can lead to incorrect inferences on statistical significance.

There are a lot of user-written programs out there, and -webseek- makes it much easier to find solutions to non-standard problems. These problems need not be exotic: one problem that fired up a lot of discussion among a group of us was the comparison of coefficients of nested logistic models. With a downloadable ado file, standardized coefficients and marginal effects can be calculated easily. The only way to do that in SPSS is with a macro that estimates a logistic model using matrix facilities (if you happen to have such a macro; it wouldn't be easy to write one). Alternative fit measures like BIC, AIC, and pseudo R^2 measures can be easily added to Stata; in SPSS you'd have to write a Visual Basic script (assuming that would work).

Stata also has excellent programs for event history analysis or panel data analysis, but perhaps these are "exotic" methods according to you or your colleagues. Well, SPSS is good enough for most purposes, most of the time. What annoys me about SPSS is that its pace of development is so slow. Only a handful of statistical procedures have been added in the last five years: GLM, NOMREG, PLUM, one or two others. Just glance through a few STBs for comparison. SPSS has concentrated on graphical output since 1995, to the annoyance of many users. Their implementation is an interface nightmare: you have to navigate two scrollbars just to view your *text* output! To hide elements of their pivot tables you choose "hide" from a right-click menu, except in some cases where you choose "ungroup". Add the bugs in the last release and the expensive price/lease, and you've got plenty of arguments in favour of Stata.

SPSS has been around for a very long time; it started off on mainframes, made it to DOS, OS/2 and finally to Windows. Because of its mainframe origins, SPSS started life as a 'data filter'. The data records were processed through a procedure or set of procedures and the results generated in an output stream. In this way, the data was read from disk file for each set of procedures carried out, but not retained in memory. The result was that very large quantities of data could be handled, on computers with limited memory. With RAM costing about $1 per megabyte, this method only serves to slow SPSS down.

Many statistical procedures can be thought of as filters, although this does not apply to techniques such as cluster analysis. The modern PC has developed a moderately complex memory model, with disk caching playing a major role, and in a number of areas SPSS has moved beyond the 'data filter', but much of its operation is still conditioned by this way of working. The interface is very interactive in a computing sense, with mouse, menus, dialog boxes and a help system, but in a statistical sense its operations are generally rather less interactive.