Shapelet forests and extremely randomized shapelet trees

In this example, we explore the training time and predictive performance of two algorithms: the random shapelet forest and extremely randomized shapelet trees.

[1]:

import numpy as np
from sklearn.model_selection import cross_validate

from wildboar.datasets import load_dataset
from wildboar.ensemble import ExtraShapeletTreesClassifier, ShapeletForestClassifier

random_state = 1234

First, we load the dataset, merging any existing training and testing partitions.

[2]:

x, y = load_dataset("Beef")
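
Merging the partitions amounts to stacking the training and testing samples row-wise, so that cross-validation can later draw its own folds. A minimal sketch of the idea with NumPy, where x_train, x_test, y_train, and y_test are hypothetical placeholder arrays and not the actual Beef data:

```python
import numpy as np

# Hypothetical placeholder partitions: 3 training and 2 testing series of length 4.
x_train = np.zeros((3, 4))
x_test = np.ones((2, 4))
y_train = np.array([0, 1, 0])
y_test = np.array([1, 1])

# Stack the samples row-wise; the labels are concatenated in the same order.
x_merged = np.concatenate([x_train, x_test], axis=0)
y_merged = np.concatenate([y_train, y_test], axis=0)

print(x_merged.shape, y_merged.shape)
```

Keeping the samples and labels in the same order is the only requirement; the split into folds is left to the cross-validation routine.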

Next, we set up the two classifiers we want to compare.

[3]:

classifiers = {
    "Shapelet forest": ShapeletForestClassifier(
        n_shapelets=10,
        metric="scaled_euclidean",
        n_jobs=-1,
        random_state=random_state,
    ),
    "Extra Shapelet Trees": ExtraShapeletTreesClassifier(
        metric="scaled_euclidean",
        n_jobs=-1,
        random_state=random_state,
    ),
}
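
Both classifiers compare time series to shapelets using the metric given above. The sketch below illustrates one common reading of such a metric, assuming that "scaled_euclidean" denotes the Euclidean distance between z-normalized subsequences and that the distance to a shapelet is the minimum over all sliding windows. The helpers znorm and shapelet_distance are hypothetical names used for illustration and are not part of wildboar:

```python
import numpy as np


def znorm(a, eps=1e-8):
    """Standardize a 1-d array to zero mean and unit standard deviation."""
    a = np.asarray(a, dtype=float)
    std = a.std()
    if std < eps:
        # A constant subsequence carries no shape information.
        return np.zeros_like(a)
    return (a - a.mean()) / std


def shapelet_distance(series, shapelet):
    """Minimum z-normalized Euclidean distance over all sliding windows."""
    series = np.asarray(series, dtype=float)
    s = znorm(shapelet)
    m = len(s)
    best = np.inf
    for start in range(len(series) - m + 1):
        window = znorm(series[start:start + m])
        best = min(best, float(np.linalg.norm(window - s)))
    return best


# The shapelet matches the rising segment of the series up to scale and offset.
d = shapelet_distance([0, 0, 1, 2, 3, 0, 0], [10, 20, 30])
print(f"distance: {d:.3f}")
```

Because each window is standardized before the comparison, the distance depends on the shape of the subsequence rather than on its amplitude or offset.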

Finally, we iterate over the classifiers and compute the cross-validated one-vs-one area under the ROC curve (the roc_auc_ovo scorer). For each classifier, we print the mean fit time and the mean test score across the folds.

[5]:

for name, clf in classifiers.items():
    scores = cross_validate(clf, x, y, scoring="roc_auc_ovo", n_jobs=1)
    print(f"Classifier: {name}")
    print(f" - fit-time:   {np.mean(scores['fit_time']):.2f}")
    print(f" - test-score: {np.mean(scores['test_score']):.2f}")

Classifier: Shapelet forest
 - fit-time:   0.75
 - test-score: 0.88
Classifier: Extra Shapelet Trees
 - fit-time:   0.17
 - test-score: 0.87
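
The numbers above are means over the folds, which hide fold-to-fold variability. A minimal sketch of reporting the standard deviation alongside the mean, assuming a dictionary of per-fold arrays keyed like the one returned by cross_validate. The summarize helper is a hypothetical name, and the values in example_scores are placeholders for illustration, not measured results:

```python
import numpy as np


def summarize(scores, keys=("fit_time", "test_score")):
    """Return {key: (mean, std)} for the per-fold arrays in `scores`."""
    return {
        key: (float(np.mean(scores[key])), float(np.std(scores[key])))
        for key in keys
    }


# Hypothetical per-fold values, for illustration only (not measured results).
example_scores = {
    "fit_time": np.array([0.2, 0.4, 0.3]),
    "test_score": np.array([0.8, 0.9, 1.0]),
}

for key, (mean, std) in summarize(example_scores).items():
    print(f" - {key}: {mean:.2f} +/- {std:.2f}")
```

When the spread across folds is larger than the gap between two classifiers, the difference in mean score should be interpreted with care.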