Cluster-Based Local Outlier Factor (CBLOF) is an unsupervised anomaly detection algorithm that identifies outliers in a dataset by taking into account the local density of points.

Core Method

  1. Partition the data into clusters using a clustering algorithm such as k-means.
  2. For each point in a cluster, calculate the local outlier factor (LOF) based on the density of its neighboring points.
  3. Calculate the cluster outlier factor (COF) by combining the LOF scores of all points in a cluster. The COF of a cluster is the average of the LOF scores of its points.
  4. Combine the COF scores of all clusters to obtain the outlier scores of individual points in the dataset.
  5. Identify the points with high outlier scores as potential outliers.

Note that the choice of clustering algorithm and the number of clusters are hyperparameters that need to be chosen by the user. Also, the LOF score calculation is based on the distance metric chosen, and different distance metrics may result in different outlier scores.