Chapter 14
Statistical Procedures
in Jennings and Young [1988], and Stapanian et al. [1991]. Computation of the critical values of
the test statistic, Max(Mds), can be easily incorporated in a software package. Sequential outlier
detection procedures based on the test statistic Max(Mds) and on multivariate kurtosis have been
included in the classical method menu in Scout. The robust module of Scout computes these critical
values and uses them on the Q-Q and index plots of the generalized distances, Mds, to formally
define and identify outliers.
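The quantities named above can be illustrated with a minimal Python sketch. It is not Scout's implementation. It assumes that the Mds are squared Mahalanobis distances computed from the classical sample mean and covariance, that the kurtosis is Mardia's coefficient (which uses the divisor-n covariance), and that a critical value for Max(Mds) can be approximated by Monte Carlo simulation under multivariate normality. The function names and the simulation approach are illustrative assumptions; the guide does not state how Scout derives its critical values.

```python
import numpy as np

def squared_mds(X, ddof=1):
    """Squared Mahalanobis distance of each row of X (n rows, p columns)
    from the sample mean, using the classical sample covariance."""
    X = np.asarray(X, dtype=float)
    diff = X - X.mean(axis=0)
    cov = np.atleast_2d(np.cov(X, rowvar=False, ddof=ddof))
    # Solve cov @ Z = diff.T rather than inverting cov explicitly.
    return np.einsum("ij,ji->i", diff, np.linalg.solve(cov, diff.T))

def max_md(X):
    """Test statistic Max(Mds): the largest squared distance."""
    return squared_mds(X).max()

def mardia_kurtosis(X):
    """Mardia's multivariate kurtosis: the mean of the squared Mds**2,
    with distances based on the divisor-n covariance. Under normality
    it is close to p * (p + 2) for large n."""
    return np.mean(squared_mds(X, ddof=0) ** 2)

def simulated_critical_value(n, p, alpha=0.05, n_sim=2000, seed=0):
    """Assumed approach: approximate the upper-alpha critical value of
    Max(Mds) by simulating standard normal samples of size n. The
    statistic is affine invariant, so the true mean and covariance
    of the normal population do not matter."""
    rng = np.random.default_rng(seed)
    sims = [max_md(rng.standard_normal((n, p))) for _ in range(n_sim)]
    return float(np.quantile(sims, 1.0 - alpha))

# Usage: 50 trivariate normal observations with one planted gross outlier.
rng = np.random.default_rng(1)
X = rng.standard_normal((50, 3))
X[0] = [8.0, 8.0, 8.0]
d2 = squared_mds(X)
crit = simulated_critical_value(n=50, p=3)
flagged = np.flatnonzero(d2 > crit)  # indices whose distance exceeds the critical value
```

A sequential procedure would remove the flagged observation, recompute the distances and the critical value for the reduced sample, and repeat until Max(Mds) no longer exceeds the critical value.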
Most outlier identification statistics, including the Max(Mds), multivariate kurtosis, and the
minimum volume ellipsoid (MVE), are functions of the Mds, which depend upon the estimates of
population location and scale. The presence of outliers usually results in distorted and unreliable
maximum likelihood estimates (MLEs) and ordinary least-squares (OLS) estimates of the population
parameters. The classical MLEs of mean and variance have a "zero" breakdown point. The
breakdown point of an estimator is the smallest fraction of observations that must be
replaced to distort the estimator without bound (Hampel [1974]). A "zero" breakdown point
means that the presence of even a single outlier can completely distort the statistic
under consideration. Consequently, all related statistics, including interval estimates, principal
components (PCs), and the estimates of regression parameters, are also distorted by outliers. This means
that the test statistics and inference based on these classical estimates may be misleading. For
example, in an environmental monitoring application, a classification
procedure based upon the distorted estimates may classify a contaminated sample as coming from
the clean population and a clean sample as coming from the contaminated part of the site. This may
Scout User's Guide