Wrapper around a collection of oversampling algorithms that balance the classes of a binary-class dataset as a preprocessing step.

oversample(
  dataset,
  ratio = NA,
  method = c("RACOG", "wRACOG", "PDFOS", "RWO", "ADASYN", "ANSMOTE", "SMOTE", "MWMOTE",
    "BLSMOTE", "DBSMOTE", "SLMOTE", "RSLSMOTE"),
  filtering = FALSE,
  classAttr = "Class",
  wrapper = c("KNN", "C5.0"),
  ...
)

Arguments

dataset

A binary class data.frame to balance.

ratio

Number between 0 and 1 giving the desired ratio of minority to majority examples, that is, the quotient (size of minority class) / (size of majority class). This parameter does not apply to some methods, such as ADASYN or wRACOG.
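
The quotient described above can be computed with base R alone. The following is a minimal sketch, not the package's implementation: `minorityRatio` and the toy data frame are hypothetical names introduced only for illustration.

```r
# Hypothetical helper: size of the minority class divided by
# the size of the majority class, for a binary class column.
minorityRatio <- function(dataset, classAttr = "Class") {
  counts <- table(dataset[[classAttr]])
  min(counts) / max(counts)
}

# Toy binary dataset: 3 minority ("positive") and 6 majority ("negative") rows.
toy <- data.frame(
  x = 1:9,
  Class = factor(c(rep("positive", 3), rep("negative", 6)))
)

minorityRatio(toy)
#> [1] 0.5
```

A value of 1 would mean both classes have the same size, which is why ratio is bounded by 1.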

method

A character string naming the method to apply. Possible methods are: RACOG, wRACOG, PDFOS, RWO, ADASYN, ANSMOTE, SMOTE, MWMOTE, BLSMOTE, DBSMOTE, SLMOTE, RSLSMOTE.

filtering

Logical (TRUE or FALSE) indicating whether to filter the oversampled instances with the NEATER algorithm.

classAttr

A character string naming the class attribute of dataset. The attribute must exist in dataset.

wrapper

A character string naming the wrapper to apply if the selected method is wRACOG. Possible values are "KNN" and "C5.0".

...

Further arguments passed to the selected method.

Value

A balanced data.frame with the same structure as dataset, containing both the original instances and the new ones.
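
Because the result keeps every original instance, the number of new minority instances needed for a given ratio follows from simple arithmetic. The sketch below is an illustration under stated assumptions, not the package's code: `instancesNeeded` is a hypothetical helper, rounding up is an assumption, and the class sizes 70 and 144 are assumed because they reproduce the ratio 0.4861111 printed in the Examples.

```r
# Hypothetical helper: smallest number of synthetic minority instances
# such that (nMinority + nNew) / nMajority >= ratio.
instancesNeeded <- function(nMinority, nMajority, ratio) {
  max(0, ceiling(ratio * nMajority - nMinority))
}

# Assumed class sizes: 70 / 144 = 0.4861111.
nMinority <- 70
nMajority <- 144

nNew <- instancesNeeded(nMinority, nMajority, ratio = 0.8)
nNew
#> [1] 46

# Ratio after adding the new instances.
(nMinority + nNew) / nMajority
#> [1] 0.8055556
```

Under these assumptions the achieved ratio slightly exceeds the requested 0.8, since only whole instances can be added.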

Examples

data(glass0)

# Oversample glass0 to get an imbalance ratio of 0.8
imbalanceRatio(glass0)
#> [1] 0.4861111
newDataset <- oversample(glass0, ratio = 0.8, method = "MWMOTE")
imbalanceRatio(newDataset)
#> [1] 0.8055556
newDataset <- oversample(glass0, method = "ADASYN")
newDataset <- oversample(glass0, ratio = 0.8, method = "SMOTE")