Shapelet forests and extremely randomized shapelet trees
In this example, we compare the training time and predictive performance of the random shapelet forest and the extremely randomized shapelet trees algorithms.
[1]:
import numpy as np
from sklearn.model_selection import cross_validate
from wildboar.datasets import load_dataset
from wildboar.ensemble import ExtraShapeletTreesClassifier, ShapeletForestClassifier
random_state = 1234
First, we load the dataset, merging any existing training and testing partitions.
[2]:
x, y = load_dataset("Beef")
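Because cross-validation creates its own splits, the predefined training and testing partitions are merged into a single set of samples. The following is a minimal NumPy sketch of what such a merge amounts to. The helper `merge_partitions` and the toy arrays are hypothetical and only illustrate the idea; in the wildboar versions we are aware of, `load_dataset` performs the merge itself (controlled by a `merge_train_test` argument), which is an assumption worth checking against your installed version.

```python
import numpy as np


def merge_partitions(x_train, y_train, x_test, y_test):
    """Stack a training and a testing partition into one dataset."""
    x = np.concatenate([x_train, x_test], axis=0)
    y = np.concatenate([y_train, y_test], axis=0)
    return x, y


# Hypothetical toy partitions: 3 training and 2 testing series of length 4.
x_train = np.zeros((3, 4))
x_test = np.ones((2, 4))
y_train = np.array([0, 1, 2])
y_test = np.array([1, 2])

x_all, y_all = merge_partitions(x_train, y_train, x_test, y_test)
print(x_all.shape, y_all.shape)  # (5, 4) (5,)
```

The merged arrays keep the training samples first, followed by the testing samples, so any later shuffling is left to the cross-validation splitter.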
Next, we set up the two classifiers we want to compare.
[3]:
classifiers = {
    "Shapelet forest": ShapeletForestClassifier(
        n_shapelets=10,
        metric="scaled_euclidean",
        n_jobs=-1,
        random_state=random_state,
    ),
    "Extra Shapelet Trees": ExtraShapeletTreesClassifier(
        metric="scaled_euclidean",
        n_jobs=-1,
        random_state=random_state,
    ),
}
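Both classifiers use `metric="scaled_euclidean"`. As we understand it, this compares a shapelet with every equally long subsequence of a time series after z-normalizing both, and keeps the smallest distance. The sketch below illustrates that idea in plain NumPy. The functions `znorm` and `scaled_euclidean_min_distance` are hypothetical names introduced for illustration; this is not wildboar's implementation, which is compiled and may differ in details such as how constant subsequences are handled.

```python
import numpy as np


def znorm(a, eps=1e-8):
    """Z-normalize a 1-D array; constant arrays map to all zeros."""
    a = np.asarray(a, dtype=float)
    std = a.std()
    if std < eps:
        return np.zeros_like(a)
    return (a - a.mean()) / std


def scaled_euclidean_min_distance(shapelet, series):
    """Smallest Euclidean distance between the z-normalized shapelet and
    any z-normalized subsequence of the same length in the series."""
    s = znorm(shapelet)
    m = len(s)
    best = np.inf
    for start in range(len(series) - m + 1):
        window = znorm(series[start:start + m])
        best = min(best, float(np.linalg.norm(s - window)))
    return best


# Toy input (assumption, not data from the example): the series contains
# the shapelet's shape at a different offset and scale.
series = np.array([0.0, 0.0, 10.0, 20.0, 30.0, 0.0])
shapelet = np.array([1.0, 2.0, 3.0])
d = scaled_euclidean_min_distance(shapelet, series)
print(d < 1e-9)  # True
```

Because of the normalization, the subsequence `[10, 20, 30]` matches the shapelet `[1, 2, 3]` almost exactly, so the metric responds to shape rather than to amplitude or offset.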
Finally, we iterate over the classifiers and compute the cross-validated area under the ROC curve, averaged one-vs-one over all pairs of classes. We also print the mean time it takes to fit each classifier and its mean test score across the folds.
[5]:
for name, clf in classifiers.items():
    score = cross_validate(clf, x, y, scoring="roc_auc_ovo", n_jobs=1)
    print(f"Classifier: {name}")
    print(" - fit-time: %.2f" % np.mean(score["fit_time"]))
    print(" - test-score: %.2f" % np.mean(score["test_score"]))
Classifier: Shapelet forest
- fit-time: 0.75
- test-score: 0.88
Classifier: Extra Shapelet Trees
- fit-time: 0.17
- test-score: 0.87
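The `roc_auc_ovo` scoring used above is a multiclass score: for every pair of classes, a binary AUC is computed in both directions and averaged, and the pairwise values are then averaged without weighting. The following NumPy sketch shows that computation under our reading of the scikit-learn scorer. The helpers `binary_auc` and `ovo_auc` are hypothetical names, and the labels and probabilities are toy inputs made up for illustration, not results from the Beef dataset.

```python
from itertools import combinations

import numpy as np


def binary_auc(scores_pos, scores_neg):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    diff = scores_pos[:, None] - scores_neg[None, :]
    return float(np.mean((diff > 0) + 0.5 * (diff == 0)))


def ovo_auc(y_true, proba, classes):
    """Unweighted mean of pairwise AUCs over all pairs of classes."""
    pair_aucs = []
    for i, j in combinations(range(len(classes)), 2):
        is_i = y_true == classes[i]
        is_j = y_true == classes[j]
        # Class i as the positive class, ranked by its own probability column.
        auc_ij = binary_auc(proba[is_i, i], proba[is_j, i])
        # Class j as the positive class, ranked by its own probability column.
        auc_ji = binary_auc(proba[is_j, j], proba[is_i, j])
        pair_aucs.append((auc_ij + auc_ji) / 2)
    return float(np.mean(pair_aucs))


# Toy labels and predicted class probabilities (assumption, for illustration).
y_toy = np.array([0, 0, 1, 1, 2, 2])
proba_toy = np.array([
    [0.8, 0.1, 0.1],
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.6, 0.2],
    [0.1, 0.2, 0.7],
    [0.2, 0.2, 0.6],
])
auc = ovo_auc(y_toy, proba_toy, classes=[0, 1, 2])
print(auc)  # 1.0
```

In the toy input every sample receives its highest probability for the correct class and outranks all samples of the other classes, so each pairwise AUC, and hence the average, is 1.0.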