SCOUT User's Guide
Chapter 14 Statistical Procedures

lead to incorrect remediation decisions. The MLE-based classical and even the robust outlier identification procedures are vulnerable to masking and swamping effects in the presence of multiple outliers. Masking means that the outliers are hidden: the presence of some outliers may mask the existence of others. Even the sequential use of the outlier identification procedures cannot help unmask these multiple outliers (e.g., see Example 1, Chapter 10). When the outliers arise in clusters, the OLS regression model is attracted toward the outliers, which deflates their residuals and leads to masking. Swamping, on the other hand, means that some of the inlying observations are identified as outliers due to the presence of some other outliers.

In the presence of multiple outliers, or for a mixture sample from two or more populations, the generalized distances, including robustified Mds, get distorted to such an extent that the cases with large Mds may not correspond to the outlying observations. This data masking distorts the estimates of the population parameters (e.g., ) and the correct ordering of the Mds in an unpredictable manner and often leads to the misidentification of outliers. The use of approximate distributions of the Mds, such as chi-square or normal, can also lead to the incorrect ordering of the Mds.

It is well known (Huber [1981], Devlin et al. [1981], Hampel et al. [1986], Rousseeuw and Leroy [1987], Rousseeuw and van Zomeren [1990], and Barnett and Lewis [1994]) that for the identification of multiple outliers, one should use robust and resistant procedures with a high breakdown point. Most of the robust procedures for the identification of outliers and the estimation of population parameters of location and scale are iterative, requiring