Subspace Outlier Detection in Data with Mixture of Variances and Noise
Nguyen, Minh Quoc
In this paper, we introduce a bottom-up approach to discover clusters of outliers in any m-dimensional subspace of an n-dimensional space. First, we propose a method to compute an outlier score for every point in each dimension. We show that if a point is an outlier in a subspace, its score must be high in each dimension of that subspace. We then aggregate the per-dimension scores to compute the final outlier score for each point in the dataset. During the aggregation, we apply a filter threshold to eliminate high-dimensional noise. The concept of an outlier is extended to allow the discovery of clusters of outliers, and an oscore(C/S) function is introduced to rank these clusters. In addition, our approach makes the outliers easy to visualize.
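The score-then-aggregate idea can be illustrated with a minimal sketch. The abstract does not define the per-dimension score or the aggregation rule, so everything below is an assumption made for illustration: `dimension_scores` uses a stand-in score (mean distance to the `k` nearest neighbours along one axis, divided by the median of those distances), and `aggregate_scores` sums only the per-dimension scores that exceed a hypothetical `threshold`. The function names, parameters, and toy data are not taken from the paper.

```python
from statistics import median


def dimension_scores(values, k=3):
    """Stand-in 1-D outlier score (an assumption, not the paper's definition):
    mean distance to the k nearest neighbours, divided by the median of
    those mean distances over all points."""
    raw = []
    for i, v in enumerate(values):
        dists = sorted(abs(v - w) for j, w in enumerate(values) if j != i)
        raw.append(sum(dists[:k]) / k)
    scale = median(raw) or 1.0  # guard against a zero median
    return [r / scale for r in raw]


def aggregate_scores(points, threshold=3.0, k=3):
    """For each point, keep only the dimensions whose score exceeds the
    filter threshold and sum those scores. Returns a list of
    (final_score, outlying_dimensions) pairs, one per point."""
    n_dims = len(points[0])
    per_dim = [
        dimension_scores([p[d] for p in points], k) for d in range(n_dims)
    ]
    results = []
    for i in range(len(points)):
        dims = [d for d in range(n_dims) if per_dim[d][i] > threshold]
        results.append((sum(per_dim[d][i] for d in dims), dims))
    return results


# Toy data: ten regular points plus one point that is far away in
# dimensions 0 and 1 but ordinary in dimension 2.
points = [(i * 0.1, i * 0.1, i * 0.1) for i in range(10)]
points.append((5.0, 5.0, 0.45))

results = aggregate_scores(points)
for index, (score, dims) in enumerate(results):
    if dims:
        print(f"point {index}: score={score:.1f}, outlying dimensions={dims}")
```

In this toy run only the last point is flagged, and only in dimensions 0 and 1: its score in dimension 2 falls below the threshold, so that dimension contributes nothing to the final score. This mirrors the filtering step described above, although the paper's actual score and threshold may be defined differently.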