baysout {dprep} | R Documentation |
This function implements the outlier detection algorithm of Bay and Schwabacher (2003). The algorithm assigns an outlyingness measure to each observation and returns the indexes of the observations with the largest measures. The number of outliers to return is specified by the user.
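To make the idea of an outlyingness measure concrete, here is a minimal brute-force sketch in base R. It is not the dprep implementation: the helper name `knn_outlier_scores` is hypothetical, and it assumes the measure is the mean Euclidean distance to the k nearest neighbors, which is one of the score definitions discussed by Bay and Schwabacher. It computes every pairwise distance and uses no blocks and no pruning.

```r
# Hypothetical helper, not part of dprep.
# Assumption: score = mean Euclidean distance to the k nearest neighbors.
knn_outlier_scores <- function(D, k = 3) {
  X <- as.matrix(D)
  dmat <- as.matrix(dist(X))   # all pairwise Euclidean distances
  diag(dmat) <- Inf            # an observation is not its own neighbor
  # For each observation, average its k smallest distances
  apply(dmat, 1, function(d) mean(sort(d)[seq_len(k)]))
}

set.seed(1)
# 50 standard-normal points plus one planted outlier in row 51
X <- rbind(matrix(rnorm(100), ncol = 2), c(10, 10))
scores <- knn_outlier_scores(X, k = 3)
top <- order(scores, decreasing = TRUE)[1:5]
cbind(index = top, measure = scores[top])
```

The brute-force version needs all n^2 distances, which is the cost that the randomization and pruning in the referenced algorithm are meant to reduce.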
baysout(D, blocks = 5, k = 3, num.out = 10)
D |
the dataset under study |
blocks |
the number of sections in which to divide the entire dataset. It must be at least as large as the number of outliers requested. |
k |
the number of neighbors to find for each observation |
num.out |
the number of outliers to return |
num.out |
Returns a two-column matrix containing the indexes of the observations with the top num.out outlyingness measures. A plot of the top candidates and their measures is also displayed. |
Caroline Rodriguez (2004). Modified by Elio Lozano (2005).
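Bay, S.D., and Schwabacher, M. (2003). Mining distance-based outliers in near linear time with randomization and a simple pruning rule. In Proceedings of the Ninth ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.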
#---- Outlier detection using Bay's algorithm ----
data(bupa)
bupa.out <- baysout(bupa[bupa[,7]==1, 1:6], blocks=10, num.out=10)
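The title of the reference mentions randomization and a simple pruning rule. The sketch below illustrates that rule in base R under stated assumptions; it is not the code behind `baysout`. The function name `pruned_outliers` is hypothetical, the measure is assumed to be the mean distance to the k nearest neighbors, and block processing is omitted for brevity. Observations are visited in random order, and the neighbor search for an observation stops as soon as its running score drops below the score of the weakest outlier kept so far.

```r
# Hypothetical helper, not part of dprep; blocks are omitted for brevity.
pruned_outliers <- function(D, k = 3, num.out = 10) {
  X <- as.matrix(D)
  n <- nrow(X)
  ord <- sample(n)              # random order makes early pruning likely
  top.idx <- integer(0)
  top.score <- numeric(0)
  cutoff <- 0                   # score of the weakest outlier kept so far
  for (i in ord) {
    nn <- rep(Inf, k)           # k smallest distances seen so far for i
    pruned <- FALSE
    for (j in ord) {
      if (j == i) next
      d <- sqrt(sum((X[i, ] - X[j, ])^2))
      if (d < nn[k]) {          # d displaces the current k-th nearest
        nn[k] <- d
        nn <- sort(nn)
      }
      # The running score can only shrink, so i cannot reach the top list
      if (mean(nn) < cutoff) {
        pruned <- TRUE
        break
      }
    }
    if (!pruned) {
      top.idx <- c(top.idx, i)
      top.score <- c(top.score, mean(nn))
      keep <- order(top.score, decreasing = TRUE)
      keep <- keep[seq_len(min(num.out, length(keep)))]
      top.idx <- top.idx[keep]
      top.score <- top.score[keep]
      if (length(top.score) == num.out) cutoff <- top.score[num.out]
    }
  }
  cbind(index = top.idx, measure = top.score)
}

set.seed(2)
# 100 standard-normal points plus two planted outliers (rows 101 and 102)
X <- rbind(matrix(rnorm(200), ncol = 2), c(8, 8), c(-9, 7))
pruned_outliers(X, k = 3, num.out = 5)
```

Because an observation is discarded only when its score is provably below the current cutoff, the result should match an exhaustive search up to ties; the pruning only saves distance computations.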