Comparing classifiers
In this example we compare the predictive performance of several time series classifiers over multiple datasets.
[1]:
import numpy as np
import pandas as pd
from sklearn.model_selection import cross_validate
from sklearn.neighbors import KNeighborsClassifier
from wildboar.datasets import list_datasets, load_dataset
from wildboar.ensemble import ExtraShapeletTreesClassifier, ShapeletForestClassifier
from wildboar.linear_model import RocketClassifier
random_state = 1234
First, we define a dictionary that maps a display name to each classifier we want to compare.
[2]:
classifiers = {
    "Nearest neighbors": KNeighborsClassifier(
        n_neighbors=1,
        metric="euclidean",
    ),
    "Shapelet forest": ShapeletForestClassifier(
        n_shapelets=10,
        metric="scaled_euclidean",
        random_state=random_state,
        n_jobs=-1,
    ),
    "Extra shapelet trees": ExtraShapeletTreesClassifier(
        metric="scaled_euclidean",
        n_jobs=-1,
        random_state=random_state,
    ),
    "ROCKET": RocketClassifier(random_state=random_state, n_jobs=-1),
}
Next, we define a dataset repository (in this example we use a subset of the UCR time series datasets) and list all datasets in the repository.
[3]:
repository = "wildboar/ucr-tiny"
datasets = list_datasets(repository)
Next, we define a dataframe in which we collect the results. Each row of the dataframe is a dataset, each column a time series classifier and each cell contains the predictive performance of the classifier on the dataset.
[4]:
df = pd.DataFrame(columns=classifiers.keys(), index=datasets, dtype=float)
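Finally, we estimate the area under the ROC curve (AUC) of each classifier on each dataset using cross-validation. The `roc_auc_ovo` scorer averages the AUC over all pairs of classes (one-vs-one) for multiclass datasets, and `cross_validate` uses 5-fold cross-validation by default. We store the mean test score over the folds.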
[5]:
for dataset in datasets:
    x, y = load_dataset(dataset, repository=repository)
    for clf_name, clf in classifiers.items():
        score = cross_validate(clf, x, y, scoring="roc_auc_ovo", n_jobs=1)
        df.loc[dataset, clf_name] = np.mean(score["test_score"])
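To make the structure of the value returned by `cross_validate` more concrete, the following is a minimal sketch that runs the same scoring call on synthetic data. The data generated with `make_classification` and the names `x_demo`, `y_demo`, `clf_demo` and `score_demo` are assumptions for illustration only; they are not part of the original example and do not involve any time series dataset.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_validate
from sklearn.neighbors import KNeighborsClassifier

# Hypothetical stand-in data: 150 samples, 3 classes (not a UCR dataset).
x_demo, y_demo = make_classification(
    n_samples=150,
    n_features=20,
    n_informative=5,
    n_classes=3,
    random_state=1234,
)

clf_demo = KNeighborsClassifier(n_neighbors=1, metric="euclidean")

# cross_validate returns a dict of arrays with one entry per fold.
score_demo = cross_validate(
    clf_demo, x_demo, y_demo, scoring="roc_auc_ovo", n_jobs=1
)

fold_scores = score_demo["test_score"]  # one AUC value per fold
mean_auc = float(np.mean(fold_scores))  # the value stored per cell above

print(sorted(score_demo.keys()))
print(fold_scores.shape)
```

The dictionary also contains `fit_time` and `score_time`, which could be collected in the same way if the running time of the classifiers is of interest.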
[6]:
df
[6]:
| Dataset | Nearest neighbors | Shapelet forest | Extra shapelet trees | ROCKET |
|---|---|---|---|---|
| Beef | 0.762500 | 0.882083 | 0.865278 | 0.934167 |
| Coffee | 1.000000 | 1.000000 | 1.000000 | 1.000000 |
| GunPoint | 0.945000 | 1.000000 | 1.000000 | 1.000000 |
| SyntheticControl | 0.949000 | 0.999667 | 0.999800 | 1.000000 |
| TwoLeadECG | 0.996559 | 0.999970 | 0.999985 | 1.000000 |
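One possible way to summarize such a table is to compute the mean score and the mean rank of each classifier over the datasets. The sketch below is not part of the original example: it copies the scores from the table above into a new dataframe (the names `scores`, `mean_rank` and `summary` are assumptions) so that it runs on its own, and it uses average ranks for ties.

```python
import pandas as pd

# Scores copied from the table above (rows: datasets, columns: classifiers).
scores = pd.DataFrame(
    {
        "Nearest neighbors": [0.762500, 1.000000, 0.945000, 0.949000, 0.996559],
        "Shapelet forest": [0.882083, 1.000000, 1.000000, 0.999667, 0.999970],
        "Extra shapelet trees": [0.865278, 1.000000, 1.000000, 0.999800, 0.999985],
        "ROCKET": [0.934167, 1.000000, 1.000000, 1.000000, 1.000000],
    },
    index=["Beef", "Coffee", "GunPoint", "SyntheticControl", "TwoLeadECG"],
)

# Mean AUC of each classifier over the datasets.
mean_score = scores.mean()

# Rank the classifiers within each dataset (1 = best); ties share the average rank.
mean_rank = scores.rank(axis=1, ascending=False, method="average").mean()

summary = pd.DataFrame(
    {"mean AUC": mean_score, "mean rank": mean_rank}
).sort_values("mean rank")
print(summary)
```

On these five datasets ROCKET obtains the lowest mean rank (1.5) and the nearest neighbor classifier the highest (3.7). With so few datasets, this ordering should be read as a description of the table rather than as evidence of a general difference between the methods.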