Filtering

Datasets can be filtered by the number of dimensions, samples, timesteps, or labels, and by dataset name.

For example, the following are valid filters:

  • n_dims<=2: keep datasets with at most 2 dimensions

  • dataset=~ProjectA: keep datasets with ProjectA in their name

  • n_labels>2: keep datasets with more than 2 labels

Wildboar currently supports the following subjects:

  • n_dims

  • n_samples

  • n_timestep

  • n_labels

  • dataset

And the following comparison operators:

  • =~: contains

  • <, <=: less than, and less than or equal to

  • >, >=: greater than, and greater than or equal to

  • ==: exactly equal to
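To make the filter grammar concrete, here is a minimal sketch of how a filter string of the form subject, operator, value could be split and evaluated. It is not Wildboar's implementation: parse_filter, matches and the properties dictionary are hypothetical names introduced only for illustration. The sketch assumes that subjects starting with n_ take integer values, that the name subject is spelled dataset as in the dataset=~ProjectA example, and that a list of filters matches only when every filter matches.

```python
import operator
import re

# Hypothetical helpers, not part of wildboar: they only illustrate the
# "<subject><operator><value>" grammar described above.
_OPERATORS = {
    "<=": operator.le,
    ">=": operator.ge,
    "==": operator.eq,
    "<": operator.lt,
    ">": operator.gt,
    "=~": lambda actual, expected: str(expected) in str(actual),
}

# Two-character operators are listed first so "<=" is not read as "<".
_PATTERN = re.compile(r"^\s*(\w+)\s*(<=|>=|==|=~|<|>)\s*(.+?)\s*$")


def parse_filter(spec):
    """Split a filter string into (subject, operator, value)."""
    match = _PATTERN.match(spec)
    if match is None:
        raise ValueError(f"invalid filter: {spec!r}")
    subject, op, value = match.groups()
    # Assumption: numeric subjects start with "n_"; names stay strings.
    if subject.startswith("n_"):
        value = int(value)
    return subject, op, value


def matches(spec, properties):
    """Return True if the dataset properties satisfy the filter."""
    subject, op, value = parse_filter(spec)
    return _OPERATORS[op](properties[subject], value)


# Made-up properties for an imaginary dataset.
properties = {
    "dataset": "ProjectA-v1",
    "n_dims": 1,
    "n_samples": 250,
    "n_timestep": 24,
    "n_labels": 3,
}

# Assumption: a list of filters is a conjunction, so all must hold.
filters = ["n_samples>200", "n_timestep<50"]
print(all(matches(f, properties) for f in filters))  # True
```

Listing the two-character operators before the one-character ones in the pattern keeps n_dims<=2 from being read as n_dims < "=2".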

[1]:
from wildboar.datasets import load_datasets

# Print the names of the datasets that satisfy both filters.
for name, (x, y) in load_datasets("wildboar/ucr", filter=["n_samples>200", "n_timestep<50"]):
    print(name)
Chinatown
Crop
ItalyPowerDemand
MelbournePedestrian
SmoothSubspace