wildboar.model_selection
Methods for model selection.
Classes
RepeatedOutlierSplit | Repeated random outlier cross-validator.
Functions
outlier_train_test_split | Outlier training and testing split from classification dataset.
- class wildboar.model_selection.RepeatedOutlierSplit(n_splits=None, *, test_size=0.2, n_outlier=0.05, shuffle=True, random_state=None)
 Repeated random outlier cross-validator.
- Parameters:
 - n_splits : int, optional
 The maximum number of splits.
   - If None, the number of splits is determined by the number of outliers as total_n_outliers/(n_inliers * n_outliers).
   - If int, the number of splits is an upper bound.
- test_size : float, optional
 The size of the test set.
- n_outlier : float, optional
 The fraction of outliers in the training and test sets.
- shuffle : bool, optional
 Shuffle the training indices in each iteration.
- random_state : int or RandomState, optional
 The pseudo-random number generator.
Notes
Contrary to other cross-validation strategies, the random outlier cross-validator does not ensure that all folds will be different. Instead, the inlier samples are shuffled and new outlier samples are inserted in the training and test sets repeatedly.
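The behaviour described in the note can be illustrated with a small standard-library sketch. It is not wildboar's implementation: the function name `sketch_repeated_outlier_split`, the `seed` argument, the rounding of the outlier counts, and the convention that inliers are labelled 1 and outliers -1 are all assumptions made for illustration.

```python
import random


def sketch_repeated_outlier_split(y, n_splits=3, test_size=0.2, n_outlier=0.05, seed=None):
    """Yield (train_idx, test_idx) pairs. Conceptual sketch, not wildboar's code."""
    rng = random.Random(seed)
    # Assumed label convention: 1 marks an inlier, -1 marks an outlier.
    inliers = [i for i, label in enumerate(y) if label == 1]
    outliers = [i for i, label in enumerate(y) if label == -1]

    n_test = round(len(inliers) * test_size)
    for _ in range(n_splits):
        # Reshuffle the inliers each iteration, so folds may overlap.
        rng.shuffle(inliers)
        test_in, train_in = inliers[:n_test], inliers[n_test:]

        # Draw a fresh set of outliers for both sets in this iteration.
        # random.sample raises ValueError if too few outliers are available.
        n_train_out = max(1, round(len(train_in) * n_outlier))
        n_test_out = max(1, round(len(test_in) * n_outlier))
        drawn = rng.sample(outliers, n_train_out + n_test_out)
        yield train_in + drawn[:n_train_out], test_in + drawn[n_train_out:]


y = [1] * 100 + [-1] * 20
for train_idx, test_idx in sketch_repeated_outlier_split(y, seed=0):
    print(len(train_idx), len(test_idx))
```

Within one iteration the training and test indices are disjoint, but nothing prevents the same sample from appearing in the test set of two different iterations, which is the difference from k-fold style strategies that the note points out.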
- get_n_splits(X, y, groups=None)
 Return the number of splitting iterations in the cross-validator.
- Parameters:
 - X : object
 The samples.
- y : object
 The labels.
- groups : object, optional
 Always ignored, exists for compatibility.
- Returns:
 - int
 Returns the number of splitting iterations in the cross-validator.
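One possible reading of the split count documented for `n_splits` is sketched below: the total number of outliers divided by the number of outliers consumed per split, capped by `n_splits` when it is an int. The helper `sketch_n_splits` is hypothetical, and the rounding and the interpretation of the formula are assumptions; the value returned by the library may differ.

```python
def sketch_n_splits(total_n_outliers, n_inliers, n_outlier=0.05, n_splits=None):
    """Estimate the number of splits. Hypothetical helper, not wildboar's code."""
    # Assumed: each split uses about n_inliers * n_outlier outliers (at least one).
    outliers_per_split = max(1, round(n_inliers * n_outlier))
    estimate = total_n_outliers // outliers_per_split
    if n_splits is None:
        return estimate
    # An integer n_splits acts as an upper bound on the estimate.
    return min(n_splits, estimate)


print(sketch_n_splits(40, 200))              # 4
print(sketch_n_splits(40, 200, n_splits=2))  # 2
```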
- wildboar.model_selection.outlier_train_test_split(x, y, normal_class, test_size=0.2, anomalies_train_size=0.05, random_state=None)
 Outlier training and testing split from classification dataset.
- Parameters:
 - x : array-like of shape (n_samples, n_timestep) or (n_samples, n_dim, n_timestep)
 Input data samples.
- y : array-like of shape (n_samples,)
 Input class labels.
- normal_class : int
 Class label that should be considered as the normal class.
- test_size : float, optional
 Size of the test set.
- anomalies_train_size : float, optional
 Contamination of anomalies in the training dataset.
- random_state : int or RandomState, optional
 Pseudo-random state used for stable results.
- Returns:
 - x_train : array-like
 Training samples.
- x_test : array-like
 Test samples.
- y_train : array-like
 Training labels (either 1 or -1, where 1 denotes normal and -1 anomalous).
- y_test : array-like
 Test labels (either 1 or -1, where 1 denotes normal and -1 anomalous).
Examples
>>> from wildboar.datasets import load_two_lead_ecg
>>> from wildboar.model_selection import outlier_train_test_split
>>> x, y = load_two_lead_ecg()
>>> x_train, x_test, y_train, y_test = outlier_train_test_split(
...     x, y, 1, test_size=0.2, anomalies_train_size=0.05
... )
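To make the relabelling and contamination concrete, the following standard-library sketch mimics the documented behaviour on plain lists. It is an illustration under assumptions, not wildboar's implementation: the name `sketch_outlier_split`, the `seed` argument, the rounding, and the choice to place all remaining anomalies in the test set are hypothetical.

```python
import math
import random


def sketch_outlier_split(x, y, normal_class, test_size=0.2,
                         anomalies_train_size=0.05, seed=None):
    """Conceptual sketch only; not wildboar's implementation."""
    rng = random.Random(seed)

    # Separate samples of the normal class from all other samples.
    normal = [xi for xi, yi in zip(x, y) if yi == normal_class]
    anomalous = [xi for xi, yi in zip(x, y) if yi != normal_class]
    rng.shuffle(normal)
    rng.shuffle(anomalous)

    # Hold out a fraction of the normal samples for testing.
    n_normal_test = math.ceil(len(normal) * test_size)
    normal_test, normal_train = normal[:n_normal_test], normal[n_normal_test:]

    # Contaminate the training set with a few anomalies (assumed here:
    # a fraction of the normal training samples, rounded up).
    n_anom_train = min(len(anomalous),
                       math.ceil(len(normal_train) * anomalies_train_size))
    anom_train, anom_test = anomalous[:n_anom_train], anomalous[n_anom_train:]

    # Relabel: 1 denotes normal and -1 anomalous, as in the Returns section.
    x_train = normal_train + anom_train
    y_train = [1] * len(normal_train) + [-1] * len(anom_train)
    x_test = normal_test + anom_test
    y_test = [1] * len(normal_test) + [-1] * len(anom_test)
    return x_train, x_test, y_train, y_test


x = list(range(120))
y = [1] * 100 + [2] * 20
x_train, x_test, y_train, y_test = sketch_outlier_split(x, y, normal_class=1, seed=0)
print(y_train.count(1), y_train.count(-1))
```

Whatever the original class labels are, the returned labels only distinguish normal (1) from anomalous (-1), so the output can be passed directly to an outlier detector that follows that convention.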