Generates synthetic minority examples for a numerical dataset by approximating the multivariate Gaussian distribution that best fits the minority data.
pdfos(dataset, numInstances, classAttr = "Class")
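Argument | Description |
---|---|
dataset | data.frame. The numerical dataset to oversample; it must contain the class column named by classAttr. |
numInstances | Integer. Number of new minority examples to generate. |
classAttr | Character. Name of the class column in dataset. Defaults to "Class". |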
A data.frame with the same structure as dataset, containing the generated synthetic examples.
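A call might look like the following sketch. It assumes that pdfos is the function exported by the imbalance package, which this page does not state, and the call is skipped when that package is not installed. The data frame toy and its columns x1 and x2 are hypothetical and exist only for illustration.

```r
# Assumption: pdfos() is provided by the 'imbalance' package.
# `toy`, `x1` and `x2` are made-up names for this illustration.
set.seed(1)
toy <- data.frame(
  x1 = c(rnorm(40, mean = 0), rnorm(10, mean = 3)),
  x2 = c(rnorm(40, mean = 0), rnorm(10, mean = 3)),
  Class = factor(c(rep("negative", 40), rep("positive", 10)))
)

if (requireNamespace("imbalance", quietly = TRUE)) {
  # Request 20 new examples of the minority class ("positive" here).
  new_examples <- imbalance::pdfos(toy, numInstances = 20, classAttr = "Class")

  # The result has the same columns as `toy`, so it can be appended directly.
  balanced <- rbind(toy, new_examples)
  print(table(balanced$Class))
}
```

Because the returned data.frame shares the structure of dataset, appending it with rbind is usually all that is needed before training a classifier.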
To generate each synthetic example, the function draws from a normal distribution whose mean is an example belonging to the minority class and whose variance is the minority class variance multiplied by a constant. That constant is chosen to minimize the mean integrated squared error of a multivariate Gaussian kernel function.
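The sampling step described above can be sketched in base R as follows. This is not the function's implementation: sketch_pdfos and its arguments are hypothetical names, and the bandwidth uses a normal-reference rule of thumb as a stand-in for the constant that the function obtains by minimizing the mean integrated squared error.

```r
# Sketch only: kernel-based oversampling with a rule-of-thumb bandwidth.
# `sketch_pdfos` is a hypothetical helper, not part of any package.
sketch_pdfos <- function(minority, num_instances) {
  x <- as.matrix(minority)          # n x d matrix of minority examples
  n <- nrow(x)
  d <- ncol(x)

  # Stand-in bandwidth (normal-reference rule of thumb). The documented
  # function instead chooses this constant by minimizing a MISE criterion.
  h <- (4 / (n * (d + 2)))^(1 / (d + 4))

  s <- cov(x)                       # minority class covariance
  r <- chol(s)                      # upper triangular, t(r) %*% r equals s

  # Pick a random minority example as the mean of each new draw.
  centers <- x[sample.int(n, num_instances, replace = TRUE), , drop = FALSE]

  # Add N(0, h^2 * s) noise: standard normal draws mapped through h * r.
  z <- matrix(rnorm(num_instances * d), nrow = num_instances, ncol = d)
  synthetic <- centers + h * (z %*% r)

  colnames(synthetic) <- colnames(x)
  rownames(synthetic) <- NULL
  as.data.frame(synthetic)
}

set.seed(42)
minority <- data.frame(a = rnorm(30, mean = 5), b = rnorm(30, mean = -2))
synthetic <- sketch_pdfos(minority, num_instances = 100)
dim(synthetic)
```

Scaling the Cholesky factor by h gives each draw a covariance of h^2 times the minority covariance, so smaller values of h keep the synthetic points closer to the original minority examples.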
Gao, Ming; Hong, Xia; Chen, Sheng; Harris, Chris J.; Khalaf, Emad. Pdfos: Pdf Estimation Based Oversampling for Imbalanced Two-Class Problems. Neurocomputing 138 (2014), p. 248–259
Silverman, B. W. Density Estimation for Statistics and Data Analysis. Chapman & Hall, 1986. – ISBN 0412246201