
Linear methods for both classification and regression.

Package Contents



CastorClassifier: A dictionary-based method using dilated competing shapelets.

DilatedShapeletClassifier: A classifier that uses random dilated shapelets.

HydraClassifier: A dictionary-based method using convolutional kernels.

RandomShapeletClassifier: A classifier that uses random shapelets.

RandomShapeletRegressor: A regressor that uses random shapelets.

RocketClassifier: Implements the ROCKET classifier.

RocketRegressor: Implements the ROCKET regressor.

class wildboar.linear_model.CastorClassifier(n_groups=64, n_shapelets=8, *, metric='euclidean', metric_params=None, normalize_prob=0.8, shapelet_size=11, lower=0.05, upper=0.1, order=1, soft_min=True, soft_max=False, soft_threshold=True, alphas=(0.1, 1.0, 10.0), fit_intercept=True, scoring=None, cv=None, class_weight=None, normalize='sparse', random_state=None, n_jobs=None)

A dictionary-based method using dilated competing shapelets.

Parameters:

n_groups : int, optional

The number of groups of dilated shapelets.

n_shapelets : int, optional

The number of dilated shapelets per group.

metric : str or callable, optional

The distance metric.

See _METRICS.keys() for a list of supported metrics.

metric_params : dict, optional

Parameters to the metric.

Read more about the parameters in the User guide.

normalize_prob : float, optional

The probability of standardizing a shapelet to zero mean and unit standard deviation.

shapelet_size : int, optional

The length of the dilated shapelet.

lower : float, optional

The lower percentile to draw distance thresholds above.

upper : float, optional

The upper percentile to draw distance thresholds below.

order : int or array-like, optional

The order of difference.

If int, half the groups with corresponding shapelets will be convolved with the order-th discrete difference along the time dimension.

soft_min : bool, optional

If True, use the sum of minimal distances. Otherwise, use the count of minimal distances.

soft_max : bool, optional

If True, use the sum of maximal distances. Otherwise, use the count of maximal distances.

soft_threshold : bool, optional

If True, count the time steps below the threshold for all shapelets. Otherwise, count the time steps below the threshold for the shapelet with the minimal distance.

alphas : array-like of shape (n_alphas,), optional

Array of alpha values to try.

fit_intercept : bool, optional

Whether to calculate the intercept for this model.

scoring : str or callable, optional

A string or a scorer callable object with signature scorer(estimator, X, y).

cv : int, cross-validation generator or an iterable, optional

Determines the cross-validation splitting strategy.

class_weight : dict or ‘balanced’, optional

Weights associated with classes in the form {class_label: weight}.

normalize : “sparse” or bool, optional

Standardize before fitting. By default, use datasets.preprocess.SparseScaler to standardize the attributes. Set to False to disable or True to use StandardScaler.

random_state : int or RandomState, optional

Controls the random sampling of kernels.

  • If int, random_state is the seed used by the random number generator.

  • If numpy.random.RandomState instance, random_state is the random number generator.

  • If None, the random number generator is the numpy.random.RandomState instance used by numpy.random.

n_jobs : int, optional

The number of parallel jobs.
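
The order parameter above refers to the discrete difference of a series along the time dimension. The following is a minimal sketch of that operation only, assuming it is the usual repeated first difference as computed by numpy.diff. The helper name discrete_difference and the series x are made up for illustration and are not part of wildboar.

```python
import numpy as np

# Hypothetical series, used only for illustration.
x = np.array([1.0, 4.0, 9.0, 16.0, 25.0])


def discrete_difference(x, order=1):
    """Return the order-th discrete difference along the last (time) axis."""
    return np.diff(x, n=order, axis=-1)


first = discrete_difference(x, order=1)   # x[t + 1] - x[t]
second = discrete_difference(x, order=2)  # difference of the first difference
```

Each level of differencing shortens the series by one time step, so a series of length 5 yields 4 values at order 1 and 3 values at order 2.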


get_metadata_routing()

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:

routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params : dict

Parameter names mapped to their values.

score(X, y, sample_weight=None)

Return the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy, which is a harsh metric since it requires that the entire label set of each sample be correctly predicted.

Parameters:

X : array-like of shape (n_samples, n_features)

Test samples.

y : array-like of shape (n_samples,) or (n_samples, n_outputs)

True labels for X.

sample_weight : array-like of shape (n_samples,), default=None

Sample weights.


Returns:

score : float

Mean accuracy of self.predict(X) w.r.t. y.
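
To see why subset accuracy is described as harsh in the multi-label case, the sketch below compares it with a per-label accuracy on a small made-up example. The function names and the arrays are hypothetical and only illustrate the definition. This is not the code that score runs.

```python
import numpy as np

# Hypothetical multi-label targets: 3 samples, 2 labels each.
y_true = np.array([[1, 0], [0, 1], [1, 1]])
y_pred = np.array([[1, 0], [0, 0], [1, 1]])


def subset_accuracy(y_true, y_pred):
    """Fraction of samples whose whole label set is predicted exactly."""
    exact_match = np.all(y_true == y_pred, axis=1)
    return float(np.mean(exact_match))


def per_label_accuracy(y_true, y_pred):
    """Fraction of individual labels predicted correctly, for comparison."""
    return float(np.mean(y_true == y_pred))


subset = subset_accuracy(y_true, y_pred)        # 2 of 3 rows match exactly
per_label = per_label_accuracy(y_true, y_pred)  # 5 of 6 labels match
```

A single wrong label makes the whole sample count as an error, so the subset accuracy is never higher than the per-label accuracy.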


set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params : dict

Estimator parameters.

Returns:

self : estimator instance

Estimator instance.
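
The <component>__<parameter> naming used by set_params can be illustrated with a toy model built on plain dictionaries. This is only a sketch of the naming convention, not scikit-learn's or wildboar's implementation. The function set_nested_params and the config dictionary are hypothetical.

```python
def set_nested_params(config, **params):
    """Apply <component>__<parameter> style updates to a nested dict."""
    for key, value in params.items():
        # Split on the first double underscore only, so a deeper name such
        # as "a__b__c" is forwarded to the component "a" as "b__c".
        name, sep, sub_key = key.partition("__")
        if sep:
            set_nested_params(config[name], **{sub_key: value})
        else:
            config[name] = value
    return config


# Hypothetical nested configuration standing in for a two-step pipeline.
config = {
    "scale": {"with_mean": True},
    "clf": {"fit_intercept": True, "cv": None},
}
set_nested_params(config, clf__fit_intercept=False, scale__with_mean=False)
```

Only the addressed parameters change. Here config["clf"]["cv"] keeps its original value.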

class wildboar.linear_model.DilatedShapeletClassifier(n_shapelets=1000, *, metric='euclidean', metric_params=None, normalize_prob=0.8, min_shapelet_size=None, max_shapelet_size=None, shapelet_size=None, lower=0.05, upper=0.1, alphas=(0.1, 1.0, 10.0), fit_intercept=True, scoring=None, cv=None, class_weight=None, normalize=True, random_state=None, n_jobs=None)

A classifier that uses random dilated shapelets.

Parameters:

n_shapelets : int, optional

The number of dilated shapelets.

metric : str or callable, optional

The distance metric.

See _METRICS.keys() for a list of supported metrics.

metric_params : dict, optional

Parameters to the metric.

Read more about the parameters in the User guide.

normalize_prob : float, optional

The probability of standardizing a shapelet to zero mean and unit standard deviation.

min_shapelet_size : float, optional

The minimum shapelet size. If None, use the discrete sizes in shapelet_size.

max_shapelet_size : float, optional

The maximum shapelet size. If None, use the discrete sizes in shapelet_size.

shapelet_size : array-like, optional

The size of shapelets.

lower : float, optional

The lower percentile to draw distance thresholds above.

upper : float, optional

The upper percentile to draw distance thresholds below.

alphas : array-like of shape (n_alphas,), optional

Array of alpha values to try.

fit_intercept : bool, optional

Whether to calculate the intercept for this model.

scoring : str or callable, optional

A string or a scorer callable object with signature scorer(estimator, X, y).

cv : int, cross-validation generator or an iterable, optional

Determines the cross-validation splitting strategy.

class_weight : dict or ‘balanced’, optional

Weights associated with classes in the form {class_label: weight}.

normalize : bool, optional

Standardize before fitting.

random_state : int or RandomState, optional

Controls the random sampling of kernels.

  • If int, random_state is the seed used by the random number generator.

  • If numpy.random.RandomState instance, random_state is the random number generator.

  • If None, the random number generator is the numpy.random.RandomState instance used by numpy.random.

n_jobs : int, optional

The number of parallel jobs.
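
The lower and upper parameters above bound the percentiles between which a distance threshold is drawn. The sketch below shows one way to read that description, assuming that lower and upper are fractions in [0, 1] and that the threshold is sampled uniformly between the two corresponding quantiles of a distance profile. The function draw_threshold and the distances array are hypothetical, and the library's actual sampling may differ.

```python
import numpy as np


def draw_threshold(distances, lower=0.05, upper=0.1, random_state=None):
    """Draw a distance threshold between two quantiles of `distances`.

    Assumption: the threshold is uniform between the `lower` and `upper`
    quantiles. `random_state` is an int seed or None in this sketch.
    """
    rng = np.random.RandomState(random_state)
    lo, hi = np.quantile(distances, [lower, upper])
    return rng.uniform(lo, hi)


# Hypothetical distance profile between one shapelet and one time series.
distances = np.linspace(0.0, 10.0, 101)
threshold = draw_threshold(distances, lower=0.05, upper=0.1, random_state=0)

# Number of time steps whose distance falls below the drawn threshold.
n_below = int(np.sum(distances < threshold))
```

With the defaults, only roughly the closest 5 to 10 percent of time steps fall below the threshold, so the resulting count measures how often a series matches the shapelet closely.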


References:

Antoine Guillaume, Christel Vrain, Elloumi Wael. Random Dilated Shapelet Transform: A New Approach for Time Series Shapelets. Pattern Recognition and Artificial Intelligence, 2022.


get_metadata_routing()

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:

routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params : dict

Parameter names mapped to their values.

score(X, y, sample_weight=None)

Return the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy, which is a harsh metric since it requires that the entire label set of each sample be correctly predicted.

Parameters:

X : array-like of shape (n_samples, n_features)

Test samples.

y : array-like of shape (n_samples,) or (n_samples, n_outputs)

True labels for X.

sample_weight : array-like of shape (n_samples,), default=None

Sample weights.

Returns:

score : float

Mean accuracy of self.predict(X) w.r.t. y.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params : dict

Estimator parameters.

Returns:

self : estimator instance

Estimator instance.

class wildboar.linear_model.HydraClassifier(*, n_groups=64, n_kernels=8, kernel_size=9, sampling='normal', sampling_params=None, order=1, alphas=(0.1, 1.0, 10.0), fit_intercept=True, scoring=None, cv=None, class_weight=None, normalize='sparse', n_jobs=None, random_state=None)

A dictionary-based method using convolutional kernels.

Parameters:

n_groups : int, optional

The number of groups of kernels.

n_kernels : int, optional

The number of kernels per group.

kernel_size : int, optional

The size of the kernel.

sampling : {“normal”}, optional

The strategy for sampling kernels. By default, kernel weights are sampled from a normal distribution with zero mean and unit standard deviation.

sampling_params : dict, optional

Parameters to the sampling approach. The “normal” sampler accepts two parameters: mean and scale.

order : int, optional

The order of difference. If set, half the groups with corresponding kernels will be convolved with the order-th discrete difference along the time dimension.

alphas : array-like of shape (n_alphas,), optional

Array of alpha values to try.

fit_intercept : bool, optional

Whether to calculate the intercept for this model.

scoring : str or callable, optional

A string or a scorer callable object with signature scorer(estimator, X, y).

cv : int, cross-validation generator or an iterable, optional

Determines the cross-validation splitting strategy.

class_weight : dict or ‘balanced’, optional

Weights associated with classes in the form {class_label: weight}.

normalize : “sparse” or bool, optional

Standardize before fitting. By default, use datasets.preprocess.SparseScaler to standardize the attributes. Set to False to disable or True to use StandardScaler.

n_jobs : int, optional

The number of jobs to run in parallel. A value of None means using a single core and a value of -1 means using all cores. Positive integers mean the exact number of cores.

random_state : int or RandomState, optional

Controls the random sampling of kernels.

  • If int, random_state is the seed used by the random number generator.

  • If numpy.random.RandomState instance, random_state is the random number generator.

  • If None, the random number generator is the numpy.random.RandomState instance used by numpy.random.
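
The sampling, sampling_params and random_state parameters above together determine the kernel weights. As a minimal sketch, assuming the weights are stored as an array of shape (n_groups, n_kernels, kernel_size) and drawn from a normal distribution with the given mean and scale, the sampling could look as follows. The function sample_kernels and the array layout are assumptions made for illustration, not wildboar's internals, and the sketch only accepts an int seed or None for random_state.

```python
import numpy as np


def sample_kernels(n_groups=64, n_kernels=8, kernel_size=9,
                   mean=0.0, scale=1.0, random_state=None):
    """Sample kernel weights from a normal distribution.

    Returns an array of shape (n_groups, n_kernels, kernel_size).
    """
    rng = np.random.RandomState(random_state)
    return rng.normal(loc=mean, scale=scale,
                      size=(n_groups, n_kernels, kernel_size))


kernels = sample_kernels(random_state=42)
same = sample_kernels(random_state=42)   # same seed, same kernels
other = sample_kernels(random_state=7)   # different seed, different kernels
```

Fixing the seed makes the sampled kernels, and therefore the resulting features, reproducible between runs.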


References:

Dempster, A., Schmidt, D. F., & Webb, G. I. (2023). Hydra: competing convolutional kernels for fast and accurate time series classification. Data Mining and Knowledge Discovery.


get_metadata_routing()

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:

routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params : dict

Parameter names mapped to their values.

score(X, y, sample_weight=None)

Return the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy, which is a harsh metric since it requires that the entire label set of each sample be correctly predicted.

Parameters:

X : array-like of shape (n_samples, n_features)

Test samples.

y : array-like of shape (n_samples,) or (n_samples, n_outputs)

True labels for X.

sample_weight : array-like of shape (n_samples,), default=None

Sample weights.

Returns:

score : float

Mean accuracy of self.predict(X) w.r.t. y.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params : dict

Estimator parameters.

Returns:

self : estimator instance

Estimator instance.

class wildboar.linear_model.RandomShapeletClassifier(n_shapelets=1000, *, metric='euclidean', metric_params=None, min_shapelet_size=0.1, max_shapelet_size=1.0, alphas=(0.1, 1.0, 10.0), fit_intercept=True, normalize=False, scoring=None, cv=None, class_weight=None, n_jobs=None, random_state=None)

A classifier that uses random shapelets.


get_metadata_routing()

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:

routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params : dict

Parameter names mapped to their values.

score(X, y, sample_weight=None)

Return the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy, which is a harsh metric since it requires that the entire label set of each sample be correctly predicted.

Parameters:

X : array-like of shape (n_samples, n_features)

Test samples.

y : array-like of shape (n_samples,) or (n_samples, n_outputs)

True labels for X.

sample_weight : array-like of shape (n_samples,), default=None

Sample weights.

Returns:

score : float

Mean accuracy of self.predict(X) w.r.t. y.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params : dict

Estimator parameters.

Returns:

self : estimator instance

Estimator instance.

class wildboar.linear_model.RandomShapeletRegressor(n_shapelets=1000, *, metric='euclidean', metric_params=None, min_shapelet_size=0.1, max_shapelet_size=1.0, alphas=(0.1, 1.0, 10.0), fit_intercept=True, normalize=False, scoring=None, cv=None, gcv_mode=None, n_jobs=None, random_state=None)

A regressor that uses random shapelets.


get_metadata_routing()

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:

routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params : dict

Parameter names mapped to their values.

score(X, y, sample_weight=None)

Return the coefficient of determination of the prediction.

The coefficient of determination \(R^2\) is defined as \(1 - \frac{u}{v}\), where \(u\) is the residual sum of squares ((y_true - y_pred) ** 2).sum() and \(v\) is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0, and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an \(R^2\) score of 0.0.

Parameters:

X : array-like of shape (n_samples, n_features)

Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead with shape (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.

y : array-like of shape (n_samples,) or (n_samples, n_outputs)

True values for X.

sample_weight : array-like of shape (n_samples,), default=None

Sample weights.

Returns:

score : float

\(R^2\) of self.predict(X) w.r.t. y.

Notes:

The \(R^2\) score used when calling score on a regressor uses multioutput='uniform_average' from scikit-learn version 0.23, for consistency with the default value of r2_score. This influences the score method of all the multioutput regressors (except for MultiOutputRegressor).
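
The definition of \(R^2\) given for score can be checked by hand. The sketch below implements the formula \(1 - \frac{u}{v}\) directly for a single output, without sample weights. The function name r2 and the numbers are made up for illustration.

```python
import numpy as np


def r2(y_true, y_pred):
    """Coefficient of determination, 1 - u / v, for a single output."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    u = ((y_true - y_pred) ** 2).sum()         # residual sum of squares
    v = ((y_true - y_true.mean()) ** 2).sum()  # total sum of squares
    return 1.0 - u / v


# Hypothetical targets and three sets of predictions.
y_true = [3.0, -0.5, 2.0, 7.0]
good = r2(y_true, [2.5, 0.0, 2.0, 8.0])  # close predictions, score near 1
constant = r2(y_true, [2.875] * 4)       # always predicts the mean of y_true
bad = r2(y_true, [7.0, 2.0, -0.5, 3.0])  # worse than the constant model
```

The constant model scores 0.0 because its residual sum of squares equals the total sum of squares, and any model with larger residuals than that scores below zero.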


set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params : dict

Estimator parameters.

Returns:

self : estimator instance

Estimator instance.

class wildboar.linear_model.RocketClassifier(n_kernels=10000, *, kernel_size=None, sampling='normal', sampling_params=None, bias_prob=1.0, normalize_prob=1.0, padding_prob=0.5, alphas=(0.1, 1.0, 10.0), fit_intercept=True, scoring=None, cv=None, class_weight=None, normalize=True, n_jobs=None, random_state=None)

Implements the ROCKET classifier.


get_metadata_routing()

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:

routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params : dict

Parameter names mapped to their values.

score(X, y, sample_weight=None)

Return the mean accuracy on the given test data and labels.

In multi-label classification, this is the subset accuracy, which is a harsh metric since it requires that the entire label set of each sample be correctly predicted.

Parameters:

X : array-like of shape (n_samples, n_features)

Test samples.

y : array-like of shape (n_samples,) or (n_samples, n_outputs)

True labels for X.

sample_weight : array-like of shape (n_samples,), default=None

Sample weights.

Returns:

score : float

Mean accuracy of self.predict(X) w.r.t. y.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params : dict

Estimator parameters.

Returns:

self : estimator instance

Estimator instance.

class wildboar.linear_model.RocketRegressor(n_kernels=10000, *, alphas=(0.1, 1.0, 10.0), fit_intercept=True, scoring=None, cv=None, gcv_mode=None, n_jobs=None, random_state=None)

Implements the ROCKET regressor.


get_metadata_routing()

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:

routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params : dict

Parameter names mapped to their values.

score(X, y, sample_weight=None)

Return the coefficient of determination of the prediction.

The coefficient of determination \(R^2\) is defined as \(1 - \frac{u}{v}\), where \(u\) is the residual sum of squares ((y_true - y_pred) ** 2).sum() and \(v\) is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0, and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an \(R^2\) score of 0.0.

Parameters:

X : array-like of shape (n_samples, n_features)

Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead with shape (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.

y : array-like of shape (n_samples,) or (n_samples, n_outputs)

True values for X.

sample_weight : array-like of shape (n_samples,), default=None

Sample weights.

Returns:

score : float

\(R^2\) of self.predict(X) w.r.t. y.

Notes:

The \(R^2\) score used when calling score on a regressor uses multioutput='uniform_average' from scikit-learn version 0.23, for consistency with the default value of r2_score. This influences the score method of all the multioutput regressors (except for MultiOutputRegressor).

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params : dict

Estimator parameters.

Returns:

self : estimator instance

Estimator instance.