Comparing classifiers

In this example, we compare the predictive performance of several time series classifiers across multiple datasets.

[1]:
import numpy as np
import pandas as pd
from sklearn.model_selection import cross_validate
from sklearn.neighbors import KNeighborsClassifier

from wildboar.datasets import list_datasets, load_dataset
from wildboar.ensemble import ExtraShapeletTreesClassifier, ShapeletForestClassifier
from wildboar.linear_model import RocketClassifier

random_state = 1234

First, we define a dictionary that maps a display name to each classifier we want to compare.

[2]:
classifiers = {
    "Nearest neighbors": KNeighborsClassifier(
        n_neighbors=1,
        metric="euclidean",
    ),
    "Shapelet forest": ShapeletForestClassifier(
        n_shapelets=10,
        metric="scaled_euclidean",
        random_state=random_state,
        n_jobs=-1,
    ),
    "Extra shapelet trees": ExtraShapeletTreesClassifier(
        metric="scaled_euclidean",
        n_jobs=-1,
        random_state=random_state,
    ),
    "ROCKET": RocketClassifier(random_state=random_state, n_jobs=-1),
}

Next, we define a dataset repository (in this example we use a subset of the UCR time series datasets) and list all datasets in the repository.

[3]:
repository = "wildboar/ucr-tiny"
datasets = list_datasets(repository)

Next, we define a dataframe in which we collect the results. Each row of the dataframe is a dataset, each column a time series classifier and each cell contains the predictive performance of the classifier on the dataset.

[4]:
df = pd.DataFrame(columns=classifiers.keys(), index=datasets, dtype=float)

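Finally, we estimate the area under the ROC curve (AUC) of each classifier on each dataset using cross-validation, and store the mean test score over the folds. For datasets with more than two classes, the `roc_auc_ovo` scorer averages the AUC over all pairs of classes (one-vs-one).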

[5]:
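for dataset in datasets:
    # Load the time series (x) and the class labels (y) of the dataset.
    x, y = load_dataset(dataset, repository=repository)
    for clf_name, clf in classifiers.items():
        # Cross-validated one-vs-one ROC AUC; the estimator is cloned for each fold.
        score = cross_validate(clf, x, y, scoring="roc_auc_ovo", n_jobs=1)
        # Store the mean test AUC over the folds.
        df.loc[dataset, clf_name] = np.mean(score["test_score"])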
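
To make the shape of the cross-validation result concrete, here is a minimal sketch that runs the same `cross_validate` call on synthetic data. It assumes only scikit-learn and NumPy; the synthetic data and the names `x_demo`, `y_demo`, `clf_demo`, `cv_result` and `fold_scores` are illustrative stand-ins that are not part of the example above.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_validate
from sklearn.neighbors import KNeighborsClassifier

# Synthetic three-class data standing in for a time series dataset
# (an assumption for illustration; the example loads real data with load_dataset).
x_demo, y_demo = make_classification(
    n_samples=150,
    n_features=20,
    n_informative=4,
    n_classes=3,
    n_clusters_per_class=1,
    random_state=0,
)

clf_demo = KNeighborsClassifier(n_neighbors=1, metric="euclidean")

# With cv left at its default, cross_validate uses 5 stratified folds for classifiers.
cv_result = cross_validate(clf_demo, x_demo, y_demo, scoring="roc_auc_ovo")

fold_scores = cv_result["test_score"]  # one AUC value per fold
print(fold_scores.shape)
print(np.mean(fold_scores))
```

The value stored in each cell of the results dataframe corresponds to `np.mean(fold_scores)` here: a single number summarizing the per-fold AUC values.
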
[6]:
df
[6]:
                  Nearest neighbors  Shapelet forest  Extra shapelet trees    ROCKET
Beef                       0.762500         0.882083              0.865278  0.934167
Coffee                     1.000000         1.000000              1.000000  1.000000
GunPoint                   0.945000         1.000000              1.000000  1.000000
SyntheticControl           0.949000         0.999667              0.999800  1.000000
TwoLeadECG                 0.996559         0.999970              0.999985  1.000000
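
One possible way to summarize the table above is to rank the classifiers within each dataset and average the ranks. The sketch below copies the AUC values from the table by hand so that it runs with pandas alone; the names `scores`, `ranks` and `mean_rank` are introduced for illustration and are not part of the example above.

```python
import pandas as pd

# AUC values copied from the results table above.
scores = pd.DataFrame(
    {
        "Nearest neighbors": [0.762500, 1.0, 0.945000, 0.949000, 0.996559],
        "Shapelet forest": [0.882083, 1.0, 1.0, 0.999667, 0.999970],
        "Extra shapelet trees": [0.865278, 1.0, 1.0, 0.999800, 0.999985],
        "ROCKET": [0.934167, 1.0, 1.0, 1.0, 1.0],
    },
    index=["Beef", "Coffee", "GunPoint", "SyntheticControl", "TwoLeadECG"],
)

# Rank the classifiers within each dataset (1 = highest AUC);
# tied classifiers share the average of the ranks they span.
ranks = scores.rank(axis=1, ascending=False, method="average")

# Average the ranks over the datasets; a lower mean rank is better.
mean_rank = ranks.mean(axis=0).sort_values()
print(mean_rank)
```

In the notebook itself, `df` can be used in place of `scores`. With only five datasets, the mean ranks are best read as a rough indication rather than evidence of a significant difference between the classifiers.
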