Treats imbalanced discrete numeric datasets by generating synthetic minority examples that approximate the probability distribution of the minority class.
racog(dataset, numInstances, burnin = 100, lag = 20, classAttr = "Class")
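Argument | Description |
---|---|
dataset | A data.frame to treat. Apart from the class attribute, its columns must be discretized and numeric. |
numInstances | Integer. Number of new minority examples to generate. |
burnin | Integer. Number of examples initially generated from each minority example that are discarded before any is kept. By default, 100. |
lag | Integer. Number of iterations between consecutive accepted examples generated from the same minority example. By default, 20. |
classAttr | Character. Name of the class attribute in dataset. By default, "Class". |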
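Since the dataset is expected to be discretized and numeric, continuous columns may need binning before the call. The following is a minimal base-R sketch of one possible preprocessing step, assuming equal-width bins are acceptable for your data. The helper `discretize_numeric`, its `bins` parameter, and the toy data are hypothetical and not part of the package.

```r
# Hypothetical helper (not part of the package): equal-width binning of
# every numeric column except the class attribute.
discretize_numeric <- function(df, classAttr = "Class", bins = 5) {
  out <- df
  for (col in setdiff(names(df), classAttr)) {
    if (is.numeric(df[[col]])) {
      # cut() returns a factor; as.integer() maps it to bin indices 1..bins
      out[[col]] <- as.integer(cut(df[[col]], breaks = bins))
    }
  }
  out
}

# Toy data, made up for illustration only.
toy <- data.frame(
  x = c(0.1, 0.4, 0.35, 0.8, 0.95),
  y = c(10, 12, 19, 25, 30),
  Class = factor(c("positive", "negative", "negative", "negative", "positive"))
)

discrete_toy <- discretize_numeric(toy, bins = 3)
```

The class column is left untouched, so the result can be passed on with the same classAttr value.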
A data.frame with the same structure as dataset, containing the generated synthetic examples.
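Because the returned data.frame has the same structure as dataset, the synthetic rows can be appended to the original data directly. The sketch below illustrates this with a hand-built stand-in for the function's output so that it runs without the package; the toy data and the values in `newSamples` are made up for illustration.

```r
# Toy imbalanced dataset (made up for illustration).
dataset <- data.frame(
  a = c(1, 2, 2, 3, 1, 3),
  b = c(0, 1, 1, 0, 1, 1),
  Class = factor(c("negative", "negative", "negative", "negative",
                   "positive", "positive"),
                 levels = c("negative", "positive"))
)

# Stand-in for the value returned by racog(dataset, numInstances = 2);
# built by hand here so the snippet runs without the package.
newSamples <- data.frame(
  a = c(1, 3),
  b = c(1, 0),
  Class = factor(c("positive", "positive"), levels = levels(dataset$Class))
)

# Same columns and types, so rbind() appends the synthetic rows directly.
balanced <- rbind(dataset, newSamples)
table(balanced$Class)
```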
Approximates the minority class distribution using a Gibbs sampler. The dataset
must be discretized and numeric. For each minority example, the sampler runs a
Markov chain that builds one new sample per iteration. The first burnin
iterations are discarded; from then on, every lag-th iteration is accepted as a
new minority example. In total it generates \(d \cdot (iterations - burnin)/lag\)
examples, where \(d\) is the number of minority examples.
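The burn-in and lag schedule can be checked with a few lines of base R. This is only an illustrative sketch: `kept_iterations` is a hypothetical helper, not a package function, and it assumes that the first accepted example comes lag iterations after the burn-in ends.

```r
# Hypothetical helper (not part of the package): indices of the chain
# iterations that would be accepted as new minority examples.
kept_iterations <- function(iterations, burnin = 100, lag = 20) {
  first_kept <- burnin + lag
  if (iterations < first_kept) return(integer(0))
  seq(from = first_kept, to = iterations, by = lag)
}

# One chain of 160 iterations with the default burnin and lag.
kept <- kept_iterations(iterations = 160, burnin = 100, lag = 20)
kept  # 120 140 160

# Each minority example contributes (iterations - burnin) / lag samples.
per_example <- length(kept)        # (160 - 100) / 20 = 3

# With d minority examples, the total is d * (iterations - burnin) / lag.
d <- 10
total_generated <- d * per_example # 30
```

Raising burnin discards more of the early chain, while raising lag spaces the accepted examples further apart; both increase the number of iterations needed for a given numInstances.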
Das, Barnan; Krishnan, Narayanan C.; Cook, Diane J. Racog and Wracog: Two Probabilistic Oversampling Techniques. IEEE Transactions on Knowledge and Data Engineering 27(2015), Nr. 1, p. 222–234.
library(imbalance)
data(iris0)
# Generates new minority examples
newSamples <- racog(iris0, numInstances = 40, burnin = 20, lag = 10, classAttr = "Class")
# Same call using the default burnin, lag and classAttr (takes longer to run)
newSamples <- racog(iris0, numInstances = 100)