wildboar.linear_model#
Linear methods for both classification and regression.
Classes#
CastorClassifier | A dictionary based method using dilated competing shapelets.
CastorRegressor | A dictionary based method using dilated competing shapelets.
DilatedShapeletClassifier | A classifier that uses random dilated shapelets.
HydraClassifier | A dictionary based method using convolutional kernels.
RandomShapeletClassifier | A classifier that uses random shapelets.
RandomShapeletRegressor | A regressor that uses random shapelets.
RocketClassifier | A classifier using Rocket transform.
RocketRegressor | A regressor using Rocket transform.
- class wildboar.linear_model.CastorClassifier(n_groups=64, n_shapelets=8, *, metric='euclidean', metric_params=None, normalize_prob=0.8, shapelet_size=11, lower=0.05, upper=0.1, order=1, soft_min=True, soft_max=False, soft_threshold=True, alphas=(0.1, 1.0, 10.0), fit_intercept=True, scoring=None, cv=None, class_weight=None, normalize='sparse', random_state=None, n_jobs=None)[source]#
 A dictionary based method using dilated competing shapelets.
- Parameters:
 - n_groups : int, optional
 The number of groups of dilated shapelets.
- n_shapelets : int, optional
 The number of dilated shapelets per group.
- metric : str or callable, optional
 The distance metric. See _METRICS.keys() for a list of supported metrics.
- metric_params : dict, optional
 Parameters to the metric.
Read more about the parameters in the User guide.
- normalize_prob : float, optional
 The probability of standardizing a shapelet to zero mean and unit standard deviation.
- shapelet_size : int, optional
 The length of the dilated shapelet.
- lower : float, optional
 The lower percentile to draw distance thresholds above.
- upper : float, optional
 The upper percentile to draw distance thresholds below.
- order : int or array-like, optional
 The order of difference.
If int, half of the groups, with their corresponding shapelets, are convolved with the discrete difference of that order along the time dimension.
- soft_min : bool, optional
 If True, use the sum of minimal distances. Otherwise, use the count of minimal distances.
- soft_max : bool, optional
 If True, use the sum of maximal distances. Otherwise, use the count of maximal distances.
- soft_threshold : bool, optional
 If True, count the time steps below the threshold for all shapelets. Otherwise, count the time steps below the threshold for the shapelet with the minimal distance.
- alphas : array-like of shape (n_alphas,), optional
 Array of alpha values to try.
- fit_intercept : bool, optional
 Whether to calculate the intercept for this model.
- scoring : str or callable, optional
 A string or a scorer callable object with signature scorer(estimator, X, y).
- cv : int, cross-validation generator or an iterable, optional
 Determines the cross-validation splitting strategy.
- class_weight : dict or ‘balanced’, optional
 Weights associated with classes in the form {class_label: weight}.
- normalize : “sparse” or bool, optional
 Standardize before fitting. By default use datasets.preprocess.SparseScaler to standardize the attributes. Set to False to disable or True to use StandardScaler.
- random_state : int or RandomState, optional
 Controls the random sampling of kernels.
If int, random_state is the seed used by the random number generator.
If numpy.random.RandomState instance, random_state is the random number generator.
If None, the random number generator is the numpy.random.RandomState instance used by numpy.random.
- n_jobs : int, optional
 The number of parallel jobs.
Notes
For better performance with multivariate datasets, scale n_shapelets by the number of dimensions (that is, use n_shapelets * n_dims) to ensure feature variability.
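The order parameter refers to a discrete difference along the time dimension. As a minimal sketch of what an order-k difference of a single series looks like, assuming plain Python lists (the helper name discrete_difference is hypothetical and not part of wildboar, whose internal implementation may differ):

```python
def discrete_difference(series, order=1):
    """Return the order-th discrete difference of a sequence of numbers."""
    values = list(series)
    for _ in range(order):
        # Each pass replaces the series with consecutive differences,
        # shortening it by one time step.
        values = [b - a for a, b in zip(values, values[1:])]
    return values


series = [1.0, 4.0, 9.0, 16.0, 25.0]
first = discrete_difference(series, order=1)
second = discrete_difference(series, order=2)
print(first)   # [3.0, 5.0, 7.0, 9.0]
print(second)  # [2.0, 2.0, 2.0]
```

With order=1 the transformed series emphasizes local change rather than absolute level, which is presumably why only half of the groups use it.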
- get_metadata_routing()[source]#
 Get metadata routing of this object.
Please check the User Guide on how the routing mechanism works.
- Returns:
 - routing : MetadataRequest
 A MetadataRequest encapsulating routing information.
- get_params(deep=True)[source]#
 Get parameters for this estimator.
- Parameters:
 - deep : bool, default=True
 If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
 - params : dict
 Parameter names mapped to their values.
- score(X, y, sample_weight=None)[source]#
 Return accuracy on provided data and labels.
In multi-label classification, this is the subset accuracy, which is a harsh metric since it requires that the entire label set of each sample be correctly predicted.
- Parameters:
 - X : array-like of shape (n_samples, n_features)
 Test samples.
- y : array-like of shape (n_samples,) or (n_samples, n_outputs)
 True labels for X.
- sample_weight : array-like of shape (n_samples,), default=None
 Sample weights.
- Returns:
 - score : float
 Mean accuracy of self.predict(X) w.r.t. y.
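To illustrate why subset accuracy is a harsh metric in the multi-label case, here is a small sketch in plain Python. The helper subset_accuracy is hypothetical and written only for illustration; a sample counts as correct only if its entire label vector matches.

```python
def subset_accuracy(y_true, y_pred):
    """Fraction of samples whose full label vector is predicted exactly."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    exact = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return exact / len(y_true)


# Three samples with two binary labels each.
y_true = [[1, 0], [0, 1], [1, 1]]
y_pred = [[1, 0], [0, 0], [1, 1]]  # the second sample has one wrong label

acc = subset_accuracy(y_true, y_pred)
print(acc)  # two of the three samples match exactly
```

Although five of the six individual labels are correct, only two of three samples are counted as correct.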
- set_params(**params)[source]#
 Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
 - **params : dict
 Estimator parameters.
- Returns:
 - self : estimator instance
 Estimator instance.
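The <component>__<parameter> naming can be sketched without any library: the part before the first double underscore names the component, and the remainder is forwarded to it. The function split_nested_params and the component names clf and scale below are hypothetical, chosen for illustration; this is not the scikit-learn implementation.

```python
def split_nested_params(params):
    """Group parameters by component using the <component>__<parameter> form.

    Keys without a double underscore belong to the estimator itself and are
    collected under the empty-string key.
    """
    grouped = {}
    for key, value in params.items():
        # str.partition splits at the first "__" only, so deeper nesting
        # such as "a__b__c" is forwarded to component "a" as "b__c".
        component, sep, sub_key = key.partition("__")
        if sep:
            grouped.setdefault(component, {})[sub_key] = value
        else:
            grouped.setdefault("", {})[key] = value
    return grouped


grouped = split_nested_params(
    {"n_jobs": 2, "clf__n_groups": 32, "clf__order": 2, "scale__with_mean": False}
)
print(grouped)
```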
- class wildboar.linear_model.CastorRegressor(n_groups=64, n_shapelets=8, *, metric='euclidean', metric_params=None, normalize_prob=0.8, shapelet_size=11, lower=0.05, upper=0.1, order=1, soft_min=True, soft_max=False, soft_threshold=True, alphas=(0.1, 1.0, 10.0), fit_intercept=True, scoring=None, cv=None, normalize='sparse', random_state=None, n_jobs=None)[source]#
 A dictionary based method using dilated competing shapelets.
- Parameters:
 - n_groups : int, optional
 The number of groups of dilated shapelets.
- n_shapelets : int, optional
 The number of dilated shapelets per group.
- metric : str or callable, optional
 The distance metric. See _METRICS.keys() for a list of supported metrics.
- metric_params : dict, optional
 Parameters to the metric.
Read more about the parameters in the User guide.
- normalize_prob : float, optional
 The probability of standardizing a shapelet to zero mean and unit standard deviation.
- shapelet_size : int, optional
 The length of the dilated shapelet.
- lower : float, optional
 The lower percentile to draw distance thresholds above.
- upper : float, optional
 The upper percentile to draw distance thresholds below.
- order : int or array-like, optional
 The order of difference.
If int, half of the groups, with their corresponding shapelets, are convolved with the discrete difference of that order along the time dimension.
- soft_min : bool, optional
 If True, use the sum of minimal distances. Otherwise, use the count of minimal distances.
- soft_max : bool, optional
 If True, use the sum of maximal distances. Otherwise, use the count of maximal distances.
- soft_threshold : bool, optional
 If True, count the time steps below the threshold for all shapelets. Otherwise, count the time steps below the threshold for the shapelet with the minimal distance.
- alphas : array-like of shape (n_alphas,), optional
 Array of alpha values to try.
- fit_intercept : bool, optional
 Whether to calculate the intercept for this model.
- scoring : str or callable, optional
 A string or a scorer callable object with signature scorer(estimator, X, y).
- cv : int, cross-validation generator or an iterable, optional
 Determines the cross-validation splitting strategy.
- normalize : “sparse” or bool, optional
 Standardize before fitting. By default use datasets.preprocess.SparseScaler to standardize the attributes. Set to False to disable or True to use StandardScaler.
- random_state : int or RandomState, optional
 Controls the random sampling of kernels.
If int, random_state is the seed used by the random number generator.
If numpy.random.RandomState instance, random_state is the random number generator.
If None, the random number generator is the numpy.random.RandomState instance used by numpy.random.
- n_jobs : int, optional
 The number of parallel jobs.
Notes
For better performance with multivariate datasets, scale n_shapelets by the number of dimensions (that is, use n_shapelets * n_dims) to ensure feature variability.
- get_metadata_routing()[source]#
 Get metadata routing of this object.
Please check the User Guide on how the routing mechanism works.
- Returns:
 - routing : MetadataRequest
 A MetadataRequest encapsulating routing information.
- get_params(deep=True)[source]#
 Get parameters for this estimator.
- Parameters:
 - deep : bool, default=True
 If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
 - params : dict
 Parameter names mapped to their values.
- score(X, y, sample_weight=None)[source]#
 Return coefficient of determination on test data.
The coefficient of determination, \(R^2\), is defined as \((1 - \frac{u}{v})\), where \(u\) is the residual sum of squares ((y_true - y_pred) ** 2).sum() and \(v\) is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an \(R^2\) score of 0.0.
- Parameters:
 - X : array-like of shape (n_samples, n_features)
 Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead with shape (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.
- y : array-like of shape (n_samples,) or (n_samples, n_outputs)
 True values for X.
- sample_weight : array-like of shape (n_samples,), default=None
 Sample weights.
- Returns:
 - score : float
 \(R^2\) of self.predict(X) w.r.t. y.
Notes
The \(R^2\) score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with the default value of r2_score. This influences the score method of all the multioutput regressors (except for MultiOutputRegressor).
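The definition of \(R^2\) above can be checked with a few lines of plain Python. This is a sketch for the single-output, unweighted case only; the function name r2_score_sketch is hypothetical, and it assumes y_true is not constant (otherwise \(v\) is zero).

```python
def r2_score_sketch(y_true, y_pred):
    """Compute R^2 = 1 - u / v for two equal-length sequences."""
    mean_true = sum(y_true) / len(y_true)
    # u: residual sum of squares
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # v: total sum of squares
    v = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - u / v


y_true = [3.0, -0.5, 2.0, 7.0]
mean_value = sum(y_true) / len(y_true)

perfect = r2_score_sketch(y_true, y_true)                    # best possible score
constant = r2_score_sketch(y_true, [mean_value] * len(y_true))  # constant model
worse = r2_score_sketch(y_true, [10.0, 10.0, 10.0, 10.0])    # worse than constant
print(perfect, constant, worse)
```

The three cases correspond to the statements in the text: a perfect prediction scores 1.0, predicting the mean scores 0.0, and a poor model scores below zero.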
- set_params(**params)[source]#
 Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
 - **params : dict
 Estimator parameters.
- Returns:
 - self : estimator instance
 Estimator instance.
- class wildboar.linear_model.DilatedShapeletClassifier(n_shapelets=1000, *, metric='euclidean', metric_params=None, normalize_prob=0.8, min_shapelet_size=None, max_shapelet_size=None, shapelet_size=None, lower=0.05, upper=0.1, alphas=(0.1, 1.0, 10.0), fit_intercept=True, scoring=None, cv=None, class_weight=None, normalize=True, random_state=None, n_jobs=None)[source]#
 A classifier that uses random dilated shapelets.
- Parameters:
 - n_shapelets : int, optional
 The number of dilated shapelets.
- metric : str or callable, optional
 The distance metric. See _METRICS.keys() for a list of supported metrics.
- metric_params : dict, optional
 Parameters to the metric.
Read more about the parameters in the User guide.
- normalize_prob : float, optional
 The probability of standardizing a shapelet to zero mean and unit standard deviation.
- min_shapelet_size : float, optional
 The minimum shapelet size. If None, use the discrete sizes in shapelet_size.
- max_shapelet_size : float, optional
 The maximum shapelet size. If None, use the discrete sizes in shapelet_size.
- shapelet_size : array-like, optional
 The size of shapelets.
- lower : float, optional
 The lower percentile to draw distance thresholds above.
- upper : float, optional
 The upper percentile to draw distance thresholds below.
- alphas : array-like of shape (n_alphas,), optional
 Array of alpha values to try.
- fit_intercept : bool, optional
 Whether to calculate the intercept for this model.
- scoring : str or callable, optional
 A string or a scorer callable object with signature scorer(estimator, X, y).
- cv : int, cross-validation generator or an iterable, optional
 Determines the cross-validation splitting strategy.
- class_weight : dict or ‘balanced’, optional
 Weights associated with classes in the form {class_label: weight}.
- normalize : bool, optional
 Standardize before fitting.
- random_state : int or RandomState, optional
 Controls the random sampling of kernels.
If int, random_state is the seed used by the random number generator.
If numpy.random.RandomState instance, random_state is the random number generator.
If None, the random number generator is the numpy.random.RandomState instance used by numpy.random.
- n_jobs : int, optional
 The number of parallel jobs.
References
- Antoine Guillaume, Christel Vrain, Elloumi Wael
 Random Dilated Shapelet Transform: A New Approach for Time Series Shapelets. Pattern Recognition and Artificial Intelligence, 2022.
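The class_weight parameter accepts either an explicit {class_label: weight} dict or ‘balanced’. As a rough sketch, assuming ‘balanced’ follows the scikit-learn convention of weighting each class by n_samples / (n_classes * class_count) (an assumption, since the text above does not define it), the resulting dict could be computed as follows. The helper balanced_class_weight is hypothetical.

```python
from collections import Counter


def balanced_class_weight(y):
    """Return {class_label: weight} with weight = n_samples / (n_classes * count)."""
    counts = Counter(y)
    n_samples = len(y)
    n_classes = len(counts)
    return {
        label: n_samples / (n_classes * count) for label, count in counts.items()
    }


labels = ["a", "a", "a", "b"]
weights = balanced_class_weight(labels)
print(weights)  # the rare class "b" receives the larger weight
```

A dict of this shape can also be written by hand and passed as class_weight when a different trade-off between classes is wanted.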
- get_metadata_routing()[source]#
 Get metadata routing of this object.
Please check the User Guide on how the routing mechanism works.
- Returns:
 - routing : MetadataRequest
 A MetadataRequest encapsulating routing information.
- get_params(deep=True)[source]#
 Get parameters for this estimator.
- Parameters:
 - deep : bool, default=True
 If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
 - params : dict
 Parameter names mapped to their values.
- score(X, y, sample_weight=None)[source]#
 Return accuracy on provided data and labels.
In multi-label classification, this is the subset accuracy, which is a harsh metric since it requires that the entire label set of each sample be correctly predicted.
- Parameters:
 - X : array-like of shape (n_samples, n_features)
 Test samples.
- y : array-like of shape (n_samples,) or (n_samples, n_outputs)
 True labels for X.
- sample_weight : array-like of shape (n_samples,), default=None
 Sample weights.
- Returns:
 - score : float
 Mean accuracy of self.predict(X) w.r.t. y.
- set_params(**params)[source]#
 Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
 - **params : dict
 Estimator parameters.
- Returns:
 - self : estimator instance
 Estimator instance.
- class wildboar.linear_model.HydraClassifier(*, n_groups=64, n_kernels=8, kernel_size=9, sampling='normal', sampling_params=None, order=1, alphas=(0.1, 1.0, 10.0), fit_intercept=True, scoring=None, cv=None, class_weight=None, normalize='sparse', n_jobs=None, random_state=None)[source]#
 A dictionary based method using convolutional kernels.
- Parameters:
 - n_groups : int, optional
 The number of groups of kernels.
- n_kernels : int, optional
 The number of kernels per group.
- kernel_size : int, optional
 The size of the kernel.
- sampling : {“normal”}, optional
 The strategy for sampling kernels. By default kernel weights are sampled from a normal distribution with zero mean and unit standard deviation.
- sampling_params : dict, optional
 Parameters to the sampling approach. The “normal” sampler accepts two parameters: mean and scale.
- order : int, optional
 The order of difference. If set, half of the groups, with their corresponding kernels, are convolved with the discrete difference of that order along the time dimension.
- alphas : array-like of shape (n_alphas,), optional
 Array of alpha values to try.
- fit_intercept : bool, optional
 Whether to calculate the intercept for this model.
- scoring : str or callable, optional
 A string or a scorer callable object with signature scorer(estimator, X, y).
- cv : int, cross-validation generator or an iterable, optional
 Determines the cross-validation splitting strategy.
- class_weight : dict or ‘balanced’, optional
 Weights associated with classes in the form {class_label: weight}.
- normalize : “sparse” or bool, optional
 Standardize before fitting. By default use datasets.preprocess.SparseScaler to standardize the attributes. Set to False to disable or True to use StandardScaler.
- n_jobs : int, optional
 The number of jobs to run in parallel. A value of None means using a single core and a value of -1 means using all cores. Positive integers mean the exact number of cores.
- random_state : int or RandomState, optional
 Controls the random resampling of the original dataset.
If int, random_state is the seed used by the random number generator.
If numpy.random.RandomState instance, random_state is the random number generator.
If None, the random number generator is the numpy.random.RandomState instance used by numpy.random.
References
- Dempster, A., Schmidt, D. F., & Webb, G. I. (2023).
 Hydra: competing convolutional kernels for fast and accurate time series classification. Data Mining and Knowledge Discovery.
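The n_jobs convention described above (None for a single core, -1 for all cores, a positive integer for an exact count) can be sketched as a small resolver. The function resolve_n_jobs is hypothetical, and rejecting other values is a choice of this sketch, since the text does not say how they are handled.

```python
import os


def resolve_n_jobs(n_jobs):
    """Translate an n_jobs setting into a concrete number of workers."""
    if n_jobs is None:
        return 1  # a single core
    if n_jobs == -1:
        # os.cpu_count() may return None on some platforms; fall back to 1.
        return os.cpu_count() or 1
    if n_jobs > 0:
        return n_jobs  # the exact number of cores requested
    raise ValueError("n_jobs must be None, -1, or a positive integer")


single = resolve_n_jobs(None)
all_cores = resolve_n_jobs(-1)
exact = resolve_n_jobs(4)
print(single, all_cores, exact)
```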
- get_metadata_routing()[source]#
 Get metadata routing of this object.
Please check the User Guide on how the routing mechanism works.
- Returns:
 - routing : MetadataRequest
 A MetadataRequest encapsulating routing information.
- get_params(deep=True)[source]#
 Get parameters for this estimator.
- Parameters:
 - deep : bool, default=True
 If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
 - params : dict
 Parameter names mapped to their values.
- score(X, y, sample_weight=None)[source]#
 Return accuracy on provided data and labels.
In multi-label classification, this is the subset accuracy, which is a harsh metric since it requires that the entire label set of each sample be correctly predicted.
- Parameters:
 - X : array-like of shape (n_samples, n_features)
 Test samples.
- y : array-like of shape (n_samples,) or (n_samples, n_outputs)
 True labels for X.
- sample_weight : array-like of shape (n_samples,), default=None
 Sample weights.
- Returns:
 - score : float
 Mean accuracy of self.predict(X) w.r.t. y.
- set_params(**params)[source]#
 Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
 - **params : dict
 Estimator parameters.
- Returns:
 - self : estimator instance
 Estimator instance.
- class wildboar.linear_model.RandomShapeletClassifier(n_shapelets=1000, *, metric='euclidean', metric_params=None, min_shapelet_size=0.1, max_shapelet_size=1.0, coverage_probability=None, variability=None, alphas=(0.1, 1.0, 10.0), fit_intercept=True, normalize=False, scoring=None, cv=None, class_weight=None, random_state=None, n_jobs=None)[source]#
 A classifier that uses random shapelets.
- Parameters:
 - n_shapelets : int or {“log2”, “sqrt”, “auto”}, optional
 The number of shapelets in the resulting transform.
If “auto”, the number of shapelets depends on the value of strategy: for “best” the number is 1, and for “random” it is 1000.
If “log2”, the number of shapelets is the log2 of the total possible number of shapelets.
If “sqrt”, the number of shapelets is the square root of the total possible number of shapelets.
- metric : str or list, optional
 If str, the distance metric used to identify the best shapelet.
If list, multiple metrics specified as a list of tuples, where the first element of the tuple is a metric name and the second element a dictionary with a parameter grid specification. A parameter grid specification is a dict with two mandatory and one optional key-value pairs defining the lower and upper bound on the values and number of values in the grid. For example, to specify a grid over the argument ‘r’ with 10 values in the range 0 to 1, we would give the following specification: dict(min_r=0, max_r=1, num_r=10).
Read more about the metrics and their parameters in the User guide.
- metric_params : dict, optional
 Parameters for the distance measure. Ignored unless metric is a string.
Read more about the parameters in the User guide.
- min_shapelet_size : float, optional
 Minimum shapelet size.
- max_shapelet_size : float, optional
 Maximum shapelet size.
- coverage_probability : float, optional
 The probability that a time step is covered by a shapelet, in the range 0 < coverage_probability <= 1.
For larger coverage_probability, we get larger shapelets.
For smaller coverage_probability, we get shorter shapelets.
- variability : float, optional
 Controls the shape of the Beta distribution used to sample shapelets. Defaults to 1.
Higher variability creates more uniform intervals.
Lower variability creates more variable interval sizes.
- alphas : array-like of shape (n_alphas,), optional
 Array of alpha values to try.
- fit_intercept : bool, optional
 Whether to calculate the intercept for this model.
- normalize : bool, optional
 Standardize before fitting.
- scoring : str or callable, optional
 A string or a scorer callable object with signature scorer(estimator, X, y).
- cv : int, cross-validation generator or an iterable, optional
 Determines the cross-validation splitting strategy.
- class_weight : dict or ‘balanced’, optional
 Weights associated with classes in the form {class_label: weight}.
- random_state : int or RandomState, optional
 Controls the random sampling of kernels.
If int, random_state is the seed used by the random number generator.
If numpy.random.RandomState instance, random_state is the random number generator.
If None, the random number generator is the numpy.random.RandomState instance used by numpy.random.
- n_jobs : int, optional
 The number of parallel jobs.
References
- Wistuba, Martin, Josif Grabocka, and Lars Schmidt-Thieme.
 Ultra-fast shapelets for time series classification. arXiv preprint arXiv:1503.05018 (2015).
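The parameter grid specification accepted by metric can be pictured as expanding into a list of candidate values. The sketch below assumes the values are evenly spaced between the bounds and that a default count applies when num_<name> is omitted; both are assumptions of this illustration, as are the helper expand_grid and the metric name "dtw" used in the example.

```python
def expand_grid(spec, name, default_num=10):
    """Expand min_<name>/max_<name>/num_<name> into evenly spaced values.

    default_num is an assumption of this sketch, used when num_<name>
    is omitted from the specification.
    """
    low = spec["min_" + name]
    high = spec["max_" + name]
    num = spec.get("num_" + name, default_num)
    if num < 2:
        return [float(low)]
    step = (high - low) / (num - 1)
    return [low + i * step for i in range(num)]


# A list of (metric name, parameter grid) tuples, as described for metric.
metric = [("dtw", dict(min_r=0, max_r=1, num_r=5))]
name, grid_spec = metric[0]
r_values = expand_grid(grid_spec, "r")
print(name, r_values)
```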
- get_metadata_routing()[source]#
 Get metadata routing of this object.
Please check the User Guide on how the routing mechanism works.
- Returns:
 - routing : MetadataRequest
 A MetadataRequest encapsulating routing information.
- get_params(deep=True)[source]#
 Get parameters for this estimator.
- Parameters:
 - deep : bool, default=True
 If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
 - params : dict
 Parameter names mapped to their values.
- score(X, y, sample_weight=None)[source]#
 Return accuracy on provided data and labels.
In multi-label classification, this is the subset accuracy, which is a harsh metric since it requires that the entire label set of each sample be correctly predicted.
- Parameters:
 - X : array-like of shape (n_samples, n_features)
 Test samples.
- y : array-like of shape (n_samples,) or (n_samples, n_outputs)
 True labels for X.
- sample_weight : array-like of shape (n_samples,), default=None
 Sample weights.
- Returns:
 - score : float
 Mean accuracy of self.predict(X) w.r.t. y.
- set_params(**params)[source]#
 Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
 - **params : dict
 Estimator parameters.
- Returns:
 - self : estimator instance
 Estimator instance.
- class wildboar.linear_model.RandomShapeletRegressor(n_shapelets=1000, *, metric='euclidean', metric_params=None, min_shapelet_size=0.1, max_shapelet_size=1.0, coverage_probability=None, variability=None, alphas=(0.1, 1.0, 10.0), fit_intercept=True, normalize=False, scoring=None, cv=None, gcv_mode=None, random_state=None, n_jobs=None)[source]#
 A regressor that uses random shapelets.
- Parameters:
 - n_shapelets : int or {“log2”, “sqrt”, “auto”}, optional
 The number of shapelets in the resulting transform.
If “auto”, the number of shapelets depends on the value of strategy: for “best” the number is 1, and for “random” it is 1000.
If “log2”, the number of shapelets is the log2 of the total possible number of shapelets.
If “sqrt”, the number of shapelets is the square root of the total possible number of shapelets.
- metric : str or list, optional
 If str, the distance metric used to identify the best shapelet.
If list, multiple metrics specified as a list of tuples, where the first element of the tuple is a metric name and the second element a dictionary with a parameter grid specification. A parameter grid specification is a dict with two mandatory and one optional key-value pairs defining the lower and upper bound on the values and number of values in the grid. For example, to specify a grid over the argument ‘r’ with 10 values in the range 0 to 1, we would give the following specification: dict(min_r=0, max_r=1, num_r=10).
Read more about the metrics and their parameters in the User guide.
- metric_params : dict, optional
 Parameters for the distance measure. Ignored unless metric is a string.
Read more about the parameters in the User guide.
- min_shapelet_size : float, optional
 Minimum shapelet size.
- max_shapelet_size : float, optional
 Maximum shapelet size.
- coverage_probability : float, optional
 The probability that a time step is covered by a shapelet, in the range 0 < coverage_probability <= 1.
For larger coverage_probability, we get larger shapelets.
For smaller coverage_probability, we get shorter shapelets.
- variability : float, optional
 Controls the shape of the Beta distribution used to sample shapelets. Defaults to 1.
Higher variability creates more uniform intervals.
Lower variability creates more variable interval sizes.
- alphas : array-like of shape (n_alphas,), optional
 Array of alpha values to try.
- fit_intercept : bool, optional
 Whether to calculate the intercept for this model.
- normalize : bool, optional
 Standardize before fitting.
- scoring : str or callable, optional
 A string or a scorer callable object with signature scorer(estimator, X, y).
- cv : int, cross-validation generator or an iterable, optional
 Determines the cross-validation splitting strategy.
- gcv_mode : {‘auto’, ‘svd’, ‘eigen’}, optional
 Flag indicating which strategy to use when performing Leave-One-Out Cross-Validation. Options are:
'auto' : use 'svd' if n_samples > n_features, otherwise use 'eigen'.
'svd' : force use of singular value decomposition of X when X is dense, eigenvalue decomposition of X^T.X when X is sparse.
'eigen' : force computation via eigendecomposition of X.X^T.
The ‘auto’ mode is the default and is intended to pick the cheaper option of the two depending on the shape of the training data.
- random_state : int or RandomState, optional
 Controls the random sampling of kernels.
If int, random_state is the seed used by the random number generator.
If numpy.random.RandomState instance, random_state is the random number generator.
If None, the random number generator is the numpy.random.RandomState instance used by numpy.random.
- n_jobs : int, optional
 The number of parallel jobs.
References
- Wistuba, Martin, Josif Grabocka, and Lars Schmidt-Thieme.
 Ultra-fast shapelets for time series classification. arXiv preprint arXiv:1503.05018 (2015).
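The rule behind gcv_mode='auto' is a simple comparison of the training data shape. The sketch below restates it in code; the function resolve_gcv_mode is hypothetical, and treating None (the default in the signature) the same as 'auto' is an assumption based on the statement that ‘auto’ is the default mode.

```python
def resolve_gcv_mode(gcv_mode, n_samples, n_features):
    """Pick the decomposition used for leave-one-out cross-validation."""
    if gcv_mode is None or gcv_mode == "auto":
        # 'svd' is chosen when there are more samples than features,
        # 'eigen' otherwise.
        return "svd" if n_samples > n_features else "eigen"
    if gcv_mode not in ("svd", "eigen"):
        raise ValueError("gcv_mode must be None, 'auto', 'svd' or 'eigen'")
    return gcv_mode


# With the default n_shapelets=1000 and a small training set, the
# transformed data has more features than samples.
few_samples = resolve_gcv_mode(None, n_samples=200, n_features=1000)
many_samples = resolve_gcv_mode("auto", n_samples=5000, n_features=1000)
forced = resolve_gcv_mode("svd", n_samples=200, n_features=1000)
print(few_samples, many_samples, forced)
```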
- get_metadata_routing()[source]#
 Get metadata routing of this object.
Please check the User Guide on how the routing mechanism works.
- Returns:
 - routing : MetadataRequest
 A MetadataRequest encapsulating routing information.
- get_params(deep=True)[source]#
 Get parameters for this estimator.
- Parameters:
 - deep : bool, default=True
 If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
 - params : dict
 Parameter names mapped to their values.
- score(X, y, sample_weight=None)[source]#
 Return coefficient of determination on test data.
The coefficient of determination, \(R^2\), is defined as \((1 - \frac{u}{v})\), where \(u\) is the residual sum of squares ((y_true - y_pred) ** 2).sum() and \(v\) is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get an \(R^2\) score of 0.0.
- Parameters:
 - X : array-like of shape (n_samples, n_features)
 Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead with shape (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.
- y : array-like of shape (n_samples,) or (n_samples, n_outputs)
 True values for X.
- sample_weight : array-like of shape (n_samples,), default=None
 Sample weights.
- Returns:
 - score : float
 \(R^2\) of self.predict(X) w.r.t. y.
Notes
The \(R^2\) score used when calling score on a regressor uses multioutput='uniform_average' from version 0.23 to keep consistent with the default value of r2_score. This influences the score method of all the multioutput regressors (except for MultiOutputRegressor).
- set_params(**params)[source]#
 Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
 - **params : dict
 Estimator parameters.
- Returns:
 - self : estimator instance
 Estimator instance.
- class wildboar.linear_model.RocketClassifier(n_kernels=10000, *, sampling='normal', sampling_params=None, kernel_size=None, min_size=None, max_size=None, bias_prob=1.0, normalize_prob=1.0, padding_prob=0.5, alphas=(0.1, 1.0, 10.0), fit_intercept=True, scoring=None, cv=None, class_weight=None, normalize=True, random_state=None, n_jobs=None)[source]#
 A classifier using Rocket transform.
- Parameters:
 - n_kernels : int, optional
 The number of kernels to sample at each node.
- sampling : {“normal”, “uniform”, “shapelet”}, optional
 The sampling of convolutional filters.
If “normal”, sample filters from a normal distribution with mean and scale.
If “uniform”, sample filters from a uniform distribution with lower and upper.
If “shapelet”, sample filters as subsequences in the training data.
- sampling_params : dict, optional
 Parameters for the sampling strategy.
If “normal”, {"mean": float, "scale": float}, defaults to {"mean": 0, "scale": 1}.
If “uniform”, {"lower": float, "upper": float}, defaults to {"lower": -1, "upper": 1}.
- kernel_size : array-like, optional
 The kernel size, by default [7, 11, 13].
- min_size : float, optional
 The minimum timestep size used for generating kernel sizes. If set, kernel_size is ignored.
- max_size : float, optional
 The maximum timestep size used for generating kernel sizes. If set, kernel_size is ignored.
- bias_prob : float, optional
 The probability of using the bias term.
- normalize_prob : float, optional
 The probability of performing normalization.
- padding_prob : float, optional
 The probability of padding with zeros.
- alphas : array-like of shape (n_alphas,), optional
 Array of alpha values to try.
- fit_intercept : bool, optional
 Whether to calculate the intercept for this model.
- scoring : str or callable, optional
 A string or a scorer callable object with signature scorer(estimator, X, y).
- cv : int, cross-validation generator or an iterable, optional
 Determines the cross-validation splitting strategy.
- class_weight : dict or ‘balanced’, optional
 Weights associated with classes in the form {class_label: weight}.
- normalize : “sparse” or bool, optional
 Standardize before fitting. By default use datasets.preprocess.SparseScaler to standardize the attributes. Set to False to disable or True to use StandardScaler.
- random_state : int or RandomState, optional
 Controls the random resampling of the original dataset.
If int, random_state is the seed used by the random number generator.
If numpy.random.RandomState instance, random_state is the random number generator.
If None, the random number generator is the numpy.random.RandomState instance used by numpy.random.
- n_jobs : int, optional
 The number of jobs to run in parallel. A value of None means using a single core and a value of -1 means using all cores. Positive integers mean the exact number of cores.
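The interplay between sampling and sampling_params can be sketched with the standard library. The function sample_kernel below is hypothetical and only mirrors the documented defaults for the “normal” and “uniform” strategies; the “shapelet” strategy needs training data and is left out, and wildboar's actual sampling code may differ.

```python
import random


def sample_kernel(size, sampling="normal", sampling_params=None, rng=None):
    """Draw `size` kernel weights using the documented default parameters."""
    if rng is None:
        rng = random.Random()
    params = dict(sampling_params or {})
    if sampling == "normal":
        mean = params.get("mean", 0.0)
        scale = params.get("scale", 1.0)
        return [rng.gauss(mean, scale) for _ in range(size)]
    if sampling == "uniform":
        lower = params.get("lower", -1.0)
        upper = params.get("upper", 1.0)
        return [rng.uniform(lower, upper) for _ in range(size)]
    raise ValueError("this sketch supports only 'normal' and 'uniform'")


rng = random.Random(0)  # fixed seed, in the spirit of random_state
normal_kernel = sample_kernel(7, rng=rng)
uniform_kernel = sample_kernel(11, "uniform", {"lower": 0.0, "upper": 2.0}, rng=rng)
print(len(normal_kernel), len(uniform_kernel))
```

The kernel lengths 7 and 11 are taken from the default kernel_size values listed above.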
- get_metadata_routing()[source]#
 Get metadata routing of this object.
Please check the User Guide on how the routing mechanism works.
- Returns:
 - routing : MetadataRequest
 A MetadataRequest encapsulating routing information.
- get_params(deep=True)[source]#
 Get parameters for this estimator.
- Parameters:
 - deep : bool, default=True
 If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
 - params : dict
 Parameter names mapped to their values.
- score(X, y, sample_weight=None)[source]#
 Return accuracy on provided data and labels.
In multi-label classification, this is the subset accuracy which is a harsh metric since you require for each sample that each label set be correctly predicted.
- Parameters:
 - Xarray-like of shape (n_samples, n_features)
 Test samples.
- yarray-like of shape (n_samples,) or (n_samples, n_outputs)
 True labels for X.
- sample_weightarray-like of shape (n_samples,), default=None
 Sample weights.
- Returns:
 - scorefloat
 Mean accuracy of self.predict(X) w.r.t. y.
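As a rough illustration of the quantity that score reports, the sketch below computes mean accuracy and, for the multi-label case, subset accuracy in plain Python. The helper accuracy and the toy labels are assumptions made for this example; this is not the library's code, and sample weights are left out for brevity.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose prediction matches the true label exactly."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)


# Single-label case: three of four predictions are correct.
single = accuracy([0, 1, 1, 0], [0, 1, 0, 0])
print(single)

# Multi-label case: each sample is a tuple of labels and counts as correct
# only if the whole tuple matches, which is why subset accuracy is harsh.
y_true = [(1, 0), (0, 1), (1, 1)]
y_pred = [(1, 0), (0, 0), (1, 1)]
subset = accuracy(y_true, y_pred)
print(subset)
```

In the multi-label example the second sample gets one of its two labels right but still counts as a miss.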
- set_params(**params)[source]#
 Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
 - **paramsdict
 Estimator parameters.
- Returns:
 - selfestimator instance
 Estimator instance.
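The <component>__<parameter> naming used by get_params and set_params can be sketched with two toy classes. Component and Nested are hypothetical stand-ins written for this example; they are not wildboar or scikit-learn classes and they only imitate the naming convention, not the real implementation.

```python
class Component:
    """Hypothetical stand-in for a simple estimator with two parameters."""

    def __init__(self, alpha=1.0, fit_intercept=True):
        self.alpha = alpha
        self.fit_intercept = fit_intercept


class Nested:
    """Hypothetical stand-in for a nested object such as a pipeline."""

    def __init__(self, **components):
        self.components = components

    def get_params(self, deep=True):
        # Shallow: only the components; deep: also their own parameters.
        params = dict(self.components)
        if deep:
            for name, comp in self.components.items():
                for key, value in vars(comp).items():
                    params[f"{name}__{key}"] = value
        return params

    def set_params(self, **params):
        for key, value in params.items():
            name, sep, sub_key = key.partition("__")
            if sep:
                # "<component>__<parameter>": update the nested object.
                setattr(self.components[name], sub_key, value)
            else:
                # Plain name: replace the component itself.
                self.components[name] = value
        return self  # returning self allows chained calls


model = Nested(ridge=Component())
model.set_params(ridge__alpha=10.0)
print(model.get_params()["ridge__alpha"])
```

Returning the instance from set_params matches the "Estimator instance" return value documented above.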
- class wildboar.linear_model.RocketRegressor(n_kernels=10000, *, sampling='normal', sampling_params=None, kernel_size=None, min_size=None, max_size=None, bias_prob=1.0, normalize_prob=1.0, padding_prob=0.5, alphas=(0.1, 1.0, 10.0), fit_intercept=True, scoring=None, cv=None, gcv_mode=None, normalize=True, random_state=None, n_jobs=None)[source]#
 A regressor using Rocket transform.
- Parameters:
 - n_kernelsint, optional
 The number of kernels to sample.
- sampling{“normal”, “uniform”, “shapelet”}, optional
 The sampling of convolutional filters.
If “normal”, sample filter according to a normal distribution with mean and scale.
If “uniform”, sample filter according to a uniform distribution with lower and upper.
If “shapelet”, sample filters as subsequences in the training data.
- sampling_paramsdict, optional
 Parameters for the sampling strategy.
If “normal”, {"mean": float, "scale": float}, defaults to {"mean": 0, "scale": 1}.
If “uniform”, {"lower": float, "upper": float}, defaults to {"lower": -1, "upper": 1}.
- kernel_sizearray-like, optional
 The kernel size, by default [7, 11, 13].
- min_sizefloat, optional
 The minimum timestep size used for generating kernel sizes. If set, kernel_size is ignored.
- max_sizefloat, optional
 The maximum timestep size used for generating kernel sizes. If set, kernel_size is ignored.
- bias_probfloat, optional
 The probability of using the bias term.
- normalize_probfloat, optional
 The probability of performing normalization.
- padding_probfloat, optional
 The probability of padding with zeros.
- alphasarray-like of shape (n_alphas,), optional
 Array of alpha values to try.
- fit_interceptbool, optional
 Whether to calculate the intercept for this model.
- scoringstr, callable, optional
 A string or a scorer callable object with signature scorer(estimator, X, y).
- cvint, cross-validation generator or an iterable, optional
 Determines the cross-validation splitting strategy.
- gcv_mode{‘auto’, ‘svd’, ‘eigen’}, optional
 Flag indicating which strategy to use when performing Leave-One-Out Cross-Validation. Options are:
'auto' : use 'svd' if n_samples > n_features, otherwise use 'eigen'
'svd' : force use of singular value decomposition of X when X is dense, eigenvalue decomposition of X^T.X when X is sparse.
'eigen' : force computation via eigendecomposition of X.X^T
The ‘auto’ mode is the default and is intended to pick the cheaper option of the two depending on the shape of the training data.
- normalize“sparse” or bool, optional
 Standardize before fitting. If “sparse”, use datasets.preprocess.SparseScaler to standardize the attributes. Set to False to disable or True to use StandardScaler. The signature above defaults to True.
- random_stateint or RandomState, optional
 Controls the random resampling of the original dataset.
If int, random_state is the seed used by the random number generator.
If numpy.random.RandomState instance, random_state is the random number generator.
If None, the random number generator is the numpy.random.RandomState instance used by numpy.random.
- n_jobsint, optional
 The number of jobs to run in parallel. A value of None means using a single core and a value of -1 means using all cores. Positive integers mean the exact number of cores.
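The sampling and sampling_params options listed above can be sketched in plain Python. The helper sample_kernel_weights and its seed argument are hypothetical and are not wildboar's implementation; the sketch only shows how the documented defaults for “normal” and “uniform” could be applied, and it leaves out “shapelet” because that strategy needs training data.

```python
import random


def sample_kernel_weights(size, sampling="normal", sampling_params=None, seed=None):
    """Draw `size` filter weights (hypothetical helper, not wildboar code)."""
    rng = random.Random(seed)
    params = dict(sampling_params or {})
    if sampling == "normal":
        mean = params.get("mean", 0.0)    # documented default: 0
        scale = params.get("scale", 1.0)  # documented default: 1
        return [rng.gauss(mean, scale) for _ in range(size)]
    if sampling == "uniform":
        lower = params.get("lower", -1.0)  # documented default: -1
        upper = params.get("upper", 1.0)   # documented default: 1
        return [rng.uniform(lower, upper) for _ in range(size)]
    raise ValueError(f"unsupported sampling strategy: {sampling!r}")


weights = sample_kernel_weights(7, "uniform", {"lower": -0.5, "upper": 0.5}, seed=0)
print(len(weights))
```

Passing the same integer seed twice yields the same weights, which is the behaviour the random_state description above relies on.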
- get_metadata_routing()[source]#
 Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns:
 - routingMetadataRequest
 A MetadataRequest encapsulating routing information.
- get_params(deep=True)[source]#
 Get parameters for this estimator.
- Parameters:
 - deepbool, default=True
 If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
 - paramsdict
 Parameter names mapped to their values.
- score(X, y, sample_weight=None)[source]#
 Return coefficient of determination on test data.
The coefficient of determination, \(R^2\), is defined as \((1 - \frac{u}{v})\), where \(u\) is the residual sum of squares ((y_true - y_pred) ** 2).sum() and \(v\) is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0 and it can be negative (because the model can be arbitrarily worse). A constant model that always predicts the expected value of y, disregarding the input features, would get a \(R^2\) score of 0.0.
- Parameters:
 - Xarray-like of shape (n_samples, n_features)
 Test samples. For some estimators this may be a precomputed kernel matrix or a list of generic objects instead with shape (n_samples, n_samples_fitted), where n_samples_fitted is the number of samples used in the fitting for the estimator.
- yarray-like of shape (n_samples,) or (n_samples, n_outputs)
 True values for X.
- sample_weightarray-like of shape (n_samples,), default=None
 Sample weights.
- Returns:
 - scorefloat
 \(R^2\) of self.predict(X) w.r.t. y.
Notes
The \(R^2\) score used when calling score on a regressor uses multioutput='uniform_average' from scikit-learn version 0.23 to keep consistent with the default value of r2_score. This influences the score method of all the multioutput regressors (except for MultiOutputRegressor).
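The definition of \(R^2\) given for score can be checked with a short plain-Python sketch. The function r2 and the toy values are assumptions made for this example; it handles a single output without sample weights and assumes y_true is not constant, since \(v\) would otherwise be zero.

```python
def r2(y_true, y_pred):
    """Coefficient of determination, 1 - u / v (plain-Python sketch)."""
    mean = sum(y_true) / len(y_true)
    # u: residual sum of squares
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # v: total sum of squares
    v = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - u / v


y_true = [1.0, 2.0, 3.0, 4.0]

perfect = r2(y_true, y_true)                 # predictions equal the targets
constant = r2(y_true, [2.5, 2.5, 2.5, 2.5])  # always predict the mean of y
reversed_ = r2(y_true, [4.0, 3.0, 2.0, 1.0])  # worse than the constant model
print(perfect, constant, reversed_)
```

The three cases correspond to the best possible score, the constant-model score, and a negative score as described in the text.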
- set_params(**params)[source]#
 Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
 - **paramsdict
 Estimator parameters.
- Returns:
 - selfestimator instance
 Estimator instance.