This paper proposes a novel definition of distance-based outliers based on the distance of a point from its kth nearest neighbor. The authors rank points by this distance and define outliers as the top n points with the highest values. They develop a partition-based algorithm that uses clustering to partition the data into disjoint subsets, computes bounds on the distance measure for each partition, and prunes entire partitions that cannot contain outliers. The final step computes outliers from the remaining points. Pruning results in significant computational savings. The algorithm is tested on real-life and synthetic data sets, showing that it scales well with both data size and dimensionality. The results from a real-life NBA database highlight unexpected aspects of the data, while synthetic data experiments demonstrate the algorithm's efficiency. The paper also compares the proposed algorithm with existing methods, showing that it is more than an order of magnitude faster in many cases. The key contributions are the new outlier definition, the partition-based algorithm with pruning, and extensive experimental results. The paper also discusses related work, including clustering algorithms and previous outlier detection methods. The authors conclude that their approach is effective for finding outliers in large datasets.
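The outlier definition summarized above can be illustrated with a minimal brute-force sketch. It is not the paper's partition-based algorithm: it computes every point's distance to its kth nearest neighbor directly and keeps the n largest, which is the baseline that pruning is meant to speed up. The function names `kth_nn_distance` and `top_n_outliers`, the Euclidean metric, and the toy data are assumptions made for illustration.

```python
import heapq
import math


def kth_nn_distance(points, i, k):
    """Distance from points[i] to its k-th nearest neighbor, excluding itself.

    Assumes Euclidean distance; the metric is an illustrative choice.
    """
    dists = sorted(
        math.dist(points[i], q) for j, q in enumerate(points) if j != i
    )
    return dists[k - 1]


def top_n_outliers(points, k, n):
    """Indices of the n points with the largest k-th nearest neighbor distance.

    Brute force: O(N^2) distance computations, with no partitioning or pruning.
    """
    scores = [(kth_nn_distance(points, i, k), i) for i in range(len(points))]
    return [i for _, i in heapq.nlargest(n, scores)]


# Hypothetical toy data: four points on a unit square plus one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
outliers = top_n_outliers(points, k=2, n=1)
print(outliers)
```

With k = 2 and n = 1, the distant point at index 4 is ranked as the single outlier, because its second-nearest neighbor is far further away than that of any point on the square.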