wildboar.ensemble

Package Contents

Classes

ShapeletForestClassifier

An ensemble of random shapelet tree classifiers.

ExtraShapeletTreesClassifier

An ensemble of extremely random shapelet trees for time series classification.

ShapeletForestRegressor

An ensemble of random shapelet regression trees.

ExtraShapeletTreesRegressor

An ensemble of extremely random shapelet trees for time series regression.

IsolationShapeletForest

An isolation shapelet forest.

ShapeletForestEmbedding

An ensemble of random shapelet trees.

class wildboar.ensemble.ShapeletForestClassifier(*, n_estimators=100, n_shapelets=10, max_depth=None, min_samples_split=2, min_shapelet_size=0, max_shapelet_size=1, metric='euclidean', metric_params=None, oob_score=False, bootstrap=True, warm_start=False, n_jobs=None, random_state=None)

Bases: BaseShapeletForestClassifier

An ensemble of random shapelet tree classifiers.

Examples

>>> from wildboar.ensemble import ShapeletForestClassifier
>>> from wildboar.datasets import load_synthetic_control
>>> x, y = load_synthetic_control()
>>> f = ShapeletForestClassifier(n_estimators=100, metric='scaled_euclidean')
>>> f.fit(x, y)
>>> y_hat = f.predict(x)

class wildboar.ensemble.ExtraShapeletTreesClassifier(*, n_estimators=100, max_depth=None, min_samples_split=2, min_shapelet_size=0, max_shapelet_size=1, metric='euclidean', metric_params=None, oob_score=False, bootstrap=True, warm_start=False, n_jobs=None, random_state=None)

Bases: BaseShapeletForestClassifier

An ensemble of extremely random shapelet trees for time series classification.

Examples

>>> from wildboar.ensemble import ExtraShapeletTreesClassifier
>>> from wildboar.datasets import load_synthetic_control
>>> x, y = load_synthetic_control()
>>> f = ExtraShapeletTreesClassifier(n_estimators=100, metric='scaled_euclidean')
>>> f.fit(x, y)
>>> y_hat = f.predict(x)

class wildboar.ensemble.ShapeletForestRegressor(*, n_estimators=100, n_shapelets=10, max_depth=None, min_samples_split=2, min_shapelet_size=0, max_shapelet_size=1, metric='euclidean', metric_params=None, oob_score=False, bootstrap=True, warm_start=False, n_jobs=None, random_state=None)

Bases: BaseShapeletForestRegressor

An ensemble of random shapelet regression trees.

Examples

>>> from wildboar.ensemble import ShapeletForestRegressor
>>> from wildboar.datasets import load_synthetic_control
>>> x, y = load_synthetic_control()
>>> f = ShapeletForestRegressor(n_estimators=100, metric='scaled_euclidean')
>>> f.fit(x, y)
>>> y_hat = f.predict(x)

class wildboar.ensemble.ExtraShapeletTreesRegressor(*, n_estimators=100, max_depth=None, min_samples_split=2, min_shapelet_size=0, max_shapelet_size=1, metric='euclidean', metric_params=None, oob_score=False, bootstrap=True, warm_start=False, n_jobs=None, random_state=None)

Bases: BaseShapeletForestRegressor

An ensemble of extremely random shapelet trees for time series regression.

Examples

>>> from wildboar.ensemble import ExtraShapeletTreesRegressor
>>> from wildboar.datasets import load_synthetic_control
>>> x, y = load_synthetic_control()
>>> f = ExtraShapeletTreesRegressor(n_estimators=100, metric='scaled_euclidean')
>>> f.fit(x, y)
>>> y_hat = f.predict(x)

class wildboar.ensemble.IsolationShapeletForest(*, n_estimators=100, bootstrap=False, n_jobs=None, min_shapelet_size=0, max_shapelet_size=1, min_samples_split=2, max_samples='auto', contamination='auto', contamination_set='training', warm_start=False, metric='euclidean', metric_params=None, random_state=None)

Bases: ShapeletForestMixin, sklearn.base.OutlierMixin, sklearn.ensemble._bagging.BaseBagging

An isolation shapelet forest.

New in version 0.3.5.

offset_

The offset used to compute the final decision.

Type:

float
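
How an offset turns raw scores into a decision can be sketched in plain Python. This is only an illustration: it assumes the convention used by scikit-learn's IsolationForest, where the decision value is the raw score minus the offset and negative values mark outliers. The helper names and the scores below are made up and are not part of wildboar.

```python
def decision_values(scores, offset):
    """Shift raw anomaly scores so that the decision threshold sits at zero."""
    return [score - offset for score in scores]


def predict_labels(scores, offset):
    """Return -1 for outliers (negative decision value) and 1 for inliers."""
    return [-1 if value < 0 else 1 for value in decision_values(scores, offset)]


# Hypothetical raw scores; lower means more anomalous.
scores = [-0.42, -0.55, -0.61, -0.48]
# Assumed offset, mirroring scikit-learn's value for contamination='auto'.
offset = -0.5

labels = predict_labels(scores, offset)
print(labels)
```

Under this assumption, a larger (less negative) offset flags more samples as outliers, because more decision values fall below zero.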

Examples

>>> from wildboar.ensemble import IsolationShapeletForest
>>> from wildboar.datasets import load_two_lead_ecg
>>> from wildboar.model_selection.outlier import train_test_split
>>> from sklearn.metrics import balanced_accuracy_score
>>> x, y = load_two_lead_ecg()
>>> x_train, x_test, y_train, y_test = train_test_split(x, y, 1, test_size=0.2, anomalies_train_size=0.05)
>>> f = IsolationShapeletForest(n_estimators=100, contamination=balanced_accuracy_score)
>>> f.fit(x_train, y_train)
>>> y_pred = f.predict(x_test)
>>> balanced_accuracy_score(y_test, y_pred)

Or, using the default offset threshold:

>>> from wildboar.ensemble import IsolationShapeletForest
>>> from wildboar.datasets import load_two_lead_ecg
>>> from wildboar.model_selection.outlier import train_test_split
>>> from sklearn.metrics import balanced_accuracy_score
>>> f = IsolationShapeletForest()
>>> x, y = load_two_lead_ecg()
>>> x_train, x_test, y_train, y_test = train_test_split(x, y, 1, test_size=0.2, anomalies_train_size=0.05)
>>> f.fit(x_train)
>>> y_pred = f.predict(x_test)
>>> balanced_accuracy_score(y_test, y_pred)

fit(x, y=None, sample_weight=None, check_input=True)

Build a bagging ensemble of estimators from the training set (x, y).

Parameters:
  • x ({array-like, sparse matrix} of shape (n_samples, n_features)) – The training input samples. Sparse matrices are accepted only if they are supported by the base estimator.

  • y (array-like of shape (n_samples,)) – The target values (class labels in classification, real numbers in regression).

  • sample_weight (array-like of shape (n_samples,), default=None) – Sample weights. If None, then samples are equally weighted. Note that this is supported only if the base estimator supports sample weighting.

Returns:

self – Fitted estimator.

Return type:

object

predict(x)
decision_function(x)
score_samples(x)

class wildboar.ensemble.ShapeletForestEmbedding(n_estimators=100, *, n_shapelets=1, max_depth=5, min_samples_split=2, min_shapelet_size=0, max_shapelet_size=1, metric='euclidean', metric_params=None, bootstrap=True, warm_start=False, n_jobs=None, sparse_output=True, random_state=None)

Bases: BaseShapeletForestRegressor

An ensemble of random shapelet trees.

An unsupervised transformation of a time series dataset to a high-dimensional sparse representation. Each time series is indexed by the leaf that it falls into in every tree. This leads to a binary coding of a time series with as many ones as there are trees in the forest.

The dimensionality of the resulting representation is <= n_estimators * 2^max_depth.
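
The leaf coding and the size bound above can be illustrated with a small pure-Python sketch. It does not call wildboar; the function names, the leaf indices, and the per-tree leaf counts are hypothetical and chosen only to show how one sample becomes a binary row with exactly one non-zero entry per tree.

```python
def max_embedding_dim(n_estimators, max_depth):
    """Upper bound on the width: a tree of depth max_depth has at most 2**max_depth leaves."""
    return n_estimators * 2 ** max_depth


def one_hot_leaves(leaf_ids, leaves_per_tree):
    """Encode one sample from the leaf it reaches in each tree.

    leaf_ids[t] is the index of the leaf reached in tree t, and
    leaves_per_tree[t] is the number of leaves in that tree.
    """
    row = []
    for leaf, n_leaves in zip(leaf_ids, leaves_per_tree):
        block = [0] * n_leaves  # one column per leaf of this tree
        block[leaf] = 1         # mark the leaf the sample falls into
        row.extend(block)
    return row


# Defaults taken from the class signature: n_estimators=100, max_depth=5.
bound = max_embedding_dim(n_estimators=100, max_depth=5)

# A toy forest of three trees with 4, 2 and 4 leaves.
row = one_hot_leaves(leaf_ids=[2, 0, 3], leaves_per_tree=[4, 2, 4])
print(bound, row, sum(row))
```

The bound is reached only if every tree is fully grown to max_depth; trees that stop splitting earlier contribute fewer columns.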

fit(x, y=None, sample_weight=None, check_input=True)

Fit a random shapelet forest regressor.

fit_transform(x, y=None, sample_weight=None, check_input=True)
transform(x)