Generates synthetic minority examples by approximating their probability distribution until sensitivity of wrapper over validation cannot be further improved. Works only on discrete numeric datasets.
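Sensitivity is the true positive rate for the minority class: the share of minority examples that the model labels correctly. The following is a minimal sketch of that measure in base R, not the package's internal code; the helper name sensitivity and the toy labels are assumptions made for illustration.

```r
# Hypothetical helper (not part of the package): sensitivity = TP / (TP + FN),
# computed for the minority class only.
sensitivity <- function(actual, predicted, minority) {
  is_minority <- actual == minority
  if (!any(is_minority)) {
    return(NA_real_)  # undefined when there are no minority examples
  }
  sum(predicted[is_minority] == minority) / sum(is_minority)
}

# Toy labels, made up for illustration
actual    <- c("pos", "pos", "pos", "neg", "neg", "pos")
predicted <- c("pos", "neg", "pos", "neg", "pos", "pos")

# Three of the four "pos" examples are predicted correctly
sensitivity(actual, predicted, "pos")
```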
wracog( train, validation, wrapper, slideWin = 10, threshold = 0.02, classAttr = "Class", ... )
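train | Training dataset to oversample, given as a data.frame. |
---|---|
validation | Validation dataset on which the sensitivity of wrapper is measured. |
wrapper | An S3 object for which a trainWrapper method is defined, or the name of a predefined wrapper ("KNN" or "C5.0"). |
slideWin | Number of last sensitivities to take into account to meet the stopping criteria. By default, 10. |
threshold | Threshold that the mean of the last slideWin sensitivities is compared against in the stopping criteria. By default, 0.02. |
classAttr | Name of the class attribute in the dataset. By default, "Class". |
... | Further arguments for wrapper. |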
A data.frame with the same structure as train, containing the generated synthetic examples.
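Because the returned data.frame has the same structure as train, the synthetic examples can be appended to the training data with rbind. The sketch below uses small stand-in data frames instead of a real call to wracog; the column names, values and class labels are made up for illustration.

```r
# Stand-in data: in practice new_samples would be the data.frame returned by
# wracog, and train the data.frame passed to it.
class_levels <- c("negative", "positive")
train <- data.frame(
  Age   = c(30, 45, 52, 61),
  Nodes = c(1, 0, 3, 0),
  Class = factor(c("negative", "negative", "negative", "positive"),
                 levels = class_levels)
)
new_samples <- data.frame(
  Age   = c(50, 54),
  Nodes = c(3, 2),
  Class = factor(c("positive", "positive"), levels = class_levels)
)

# Same columns and types, so the rows can simply be stacked
balanced <- rbind(train, new_samples)
table(balanced$Class)
```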
Until the last slideWin executions of wrapper over the validation dataset reach a mean sensitivity lower than threshold, the algorithm keeps generating samples with a Gibbs sampler and adds to the train dataset those samples that are misclassified by the model built on the previous train dataset. The initial model is built on the initial train.
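The stopping rule described above can be sketched as a check over a sliding window of sensitivities. This is a literal reading of the wording on this page, not the package's source code, whose exact test may differ; the helper name should_stop and the sensitivity values are assumptions made for illustration.

```r
# Hypothetical helper (not part of the package): TRUE once the mean of the
# last slide_win sensitivities is lower than threshold.
should_stop <- function(sensitivities, slide_win = 10, threshold = 0.02) {
  if (length(sensitivities) < slide_win) {
    return(FALSE)  # not enough iterations recorded yet
  }
  mean(tail(sensitivities, slide_win)) < threshold
}

# Made-up values, one per execution of the wrapper
history <- c(0.50, 0.40, 0.30, 0.015, 0.010, 0.005)

should_stop(history, slide_win = 3)  # window mean is 0.01
should_stop(history, slide_win = 5)  # window mean is 0.146
```

A larger slideWin makes the check less sensitive to a single low value, at the cost of more iterations before the loop can end.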
Das, Barnan; Krishnan, Narayanan C.; Cook, Diane J. Racog and Wracog: Two Probabilistic Oversampling Techniques. IEEE Transactions on Knowledge and Data Engineering 27(2015), Nr. 1, p. 222–234.
library(imbalance)
data(haberman)

# Create train and validation partitions of haberman
trainFold <- sample(1:nrow(haberman), nrow(haberman)/2, FALSE)
trainSet <- haberman[trainFold, ]
validationSet <- haberman[-trainFold, ]

# Define our own wrapper with a C5.0 tree
myWrapper <- structure(list(), class = "TestWrapper")
trainWrapper.TestWrapper <- function(wrapper, train, trainClass){
  C50::C5.0(train, trainClass)
}

# Execute wRACOG with our own wrapper
newSamples <- wracog(trainSet, validationSet, myWrapper, classAttr = "Class")
#> Error in UseMethod("trainWrapper"): no applicable method for 'trainWrapper' applied to an object of class "TestWrapper"

# Execute wRACOG with predefined wrappers for "KNN" or "C5.0"
KNNSamples <- wracog(trainSet, validationSet, "KNN")
C50Samples <- wracog(trainSet, validationSet, "C5.0")