Random Sampler
Random sampling by mixing under-sampling and over-sampling.
This is a wrapper for classifiers. It will train the provided classifier by both under-sampling and over-sampling the stream of given observations so that the class distribution seen by the classifier follows a given desired distribution.
Parameters
- classifier (Model) → Classifier Model.
- desired_dist (dict) → The desired class distribution. The keys are the classes whilst the values are the desired class proportions. The values must sum up to 1. If set to None, then the observations will be sampled uniformly at random, which is strictly equivalent to using ensemble.BaggingClassifier.
- sampling_rate (int, Default: 1.0) → The desired ratio of data to sample.
- seed (int | None, Default: None) → Random seed for reproducibility.
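To make the role of desired_dist and sampling_rate more concrete, here is a minimal pure-Python sketch of mixed under-sampling and over-sampling on a stream. It is not TurboML's implementation. It assumes one common scheme in which each observation is repeated a Poisson-distributed number of times, with a rate equal to sampling_rate multiplied by the desired share of its class and divided by the share observed so far. The names resample and poisson_draw are hypothetical helpers introduced only for this illustration.

```python
import math
import random
from collections import Counter


def poisson_draw(rate, rng):
    """Draw from a Poisson distribution (Knuth's multiplication method)."""
    threshold = math.exp(-rate)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def resample(stream, desired_dist, sampling_rate=1.0, seed=None):
    """Yield (x, y) pairs so that class shares approach desired_dist.

    Assumption: this mirrors the general idea of the sampler, not its
    actual internals. Classes that are rarer than desired get a rate
    above 1 (over-sampling); classes that are more frequent than
    desired get a rate below 1 (under-sampling).
    """
    rng = random.Random(seed)
    seen = Counter()  # class counts observed so far
    total = 0
    for x, y in stream:
        seen[y] += 1
        total += 1
        observed_share = seen[y] / total
        rate = sampling_rate * desired_dist[y] / observed_share
        # Emit the observation zero or more times.
        for _ in range(poisson_draw(rate, rng)):
            yield x, y


if __name__ == "__main__":
    data_rng = random.Random(0)
    # Imbalanced toy stream: roughly 10% of the labels are 1.
    stream = [(i, 1 if data_rng.random() < 0.1 else 0) for i in range(10_000)]
    counts = Counter(y for _, y in resample(stream, {0: 0.5, 1: 0.5}, seed=42))
    total = sum(counts.values())
    print({label: round(n / total, 3) for label, n in counts.items()})
```

With sampling_rate=1.0 the number of emitted observations stays close to the size of the input stream, and lower values shrink it proportionally. In the wrapper described above, each emitted copy would correspond to one training step of the wrapped classifier.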
Example Usage
We can create an instance of the Random Sampler like this.
import turboml as tb
htc_model = tb.HoeffdingTreeClassifier(n_classes=2)
sampler_model = tb.RandomSampler(base_model=htc_model)