*******************************
:py:mod:`wildboar.linear_model`
*******************************
.. py:module:: wildboar.linear_model
.. autoapi-nested-parse::
Linear methods for both classification and regression.
..
!! processed by numpydoc !!
Classes
-------
.. autoapisummary::
wildboar.linear_model.CastorClassifier
wildboar.linear_model.CastorRegressor
wildboar.linear_model.DilatedShapeletClassifier
wildboar.linear_model.HydraClassifier
wildboar.linear_model.RandomShapeletClassifier
wildboar.linear_model.RandomShapeletRegressor
wildboar.linear_model.RocketClassifier
wildboar.linear_model.RocketRegressor
.. py:class:: CastorClassifier(n_groups=64, n_shapelets=8, *, metric='euclidean', metric_params=None, normalize_prob=0.8, shapelet_size=11, lower=0.05, upper=0.1, order=1, soft_min=True, soft_max=False, soft_threshold=True, alphas=(0.1, 1.0, 10.0), fit_intercept=True, scoring=None, cv=None, class_weight=None, normalize='sparse', random_state=None, n_jobs=None)
A dictionary-based method using dilated competing shapelets.
:Parameters:
**n_groups** : int, optional
The number of groups of dilated shapelets.
**n_shapelets** : int, optional
The number of dilated shapelets per group.
**metric** : str or callable, optional
The distance metric.
See ``_METRICS.keys()`` for a list of supported metrics.
**metric_params** : dict, optional
Parameters to the metric.
Read more about the parameters in the
:ref:`User guide `.
**normalize_prob** : float, optional
The probability of standardizing a shapelet with zero mean and unit
standard deviation.
**shapelet_size** : int, optional
The length of the dilated shapelet.
**lower** : float, optional
The lower percentile to draw distance thresholds above.
**upper** : float, optional
The upper percentile to draw distance thresholds below.
**order** : int or array-like, optional
The order of difference.
If int, half the groups with corresponding shapelets will be convolved
with the `order` discrete difference along the time dimension.
**soft_min** : bool, optional
If `True`, use the sum of minimal distances. Otherwise, use the count
of minimal distances.
**soft_max** : bool, optional
If `True`, use the sum of maximal distances. Otherwise, use the count
of maximal distances.
**soft_threshold** : bool, optional
If `True`, count the time steps below the threshold for all shapelets.
Otherwise, count the time steps below the threshold for the shapelet
with the minimal distance.
**alphas** : array-like of shape (n_alphas,), optional
Array of alpha values to try.
**fit_intercept** : bool, optional
Whether to calculate the intercept for this model.
**scoring** : str, callable, optional
A string or a scorer callable object with signature
`scorer(estimator, X, y)`.
**cv** : int, cross-validation generator or an iterable, optional
Determines the cross-validation splitting strategy.
**class_weight** : dict or 'balanced', optional
Weights associated with classes in the form `{class_label: weight}`.
**normalize** : "sparse" or bool, optional
Standardize before fitting. By default use
:class:`datasets.preprocess.SparseScaler` to standardize the attributes. Set
to `False` to disable or `True` to use `StandardScaler`.
**random_state** : int or RandomState, optional
Controls the random sampling of kernels.
- If `int`, `random_state` is the seed used by the random number
generator.
- If :class:`numpy.random.RandomState` instance, `random_state` is
the random number generator.
- If `None`, the random number generator is the
:class:`numpy.random.RandomState` instance used by
:func:`numpy.random`.
**n_jobs** : int, optional
The number of parallel jobs.
.. rubric:: Notes
For better performance with multivariate datasets, multiply `n_shapelets` by
the number of dimensions (`n_shapelets * n_dims`) to ensure feature variability.
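The `order` parameter above refers to a discrete difference along the time dimension. The following is a minimal pure-Python sketch of what an order-`k` difference computes, assuming the usual definition (the first difference applied `k` times). The helper name `discrete_difference` is hypothetical and is not part of wildboar.

```python
def discrete_difference(series, order=1):
    """Apply the first difference `order` times (hypothetical helper)."""
    values = list(series)
    for _ in range(order):
        # Each pass replaces the series with consecutive differences,
        # shortening it by one time step.
        values = [b - a for a, b in zip(values, values[1:])]
    return values


x = [1.0, 4.0, 9.0, 16.0, 25.0]
first = discrete_difference(x, order=1)   # differences of consecutive values
second = discrete_difference(x, order=2)  # differences of the differences
```

Each pass shortens the series by one time step, so an order-`k` difference of a series of length `n` has length `n - k`.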
..
!! processed by numpydoc !!
.. py:method:: get_metadata_routing()
Get metadata routing of this object.
Please check :ref:`User Guide ` on how the routing
mechanism works.
:Returns:
**routing** : MetadataRequest
A :class:`~sklearn.utils.metadata_routing.MetadataRequest` encapsulating
routing information.
..
!! processed by numpydoc !!
.. py:method:: get_params(deep=True)
Get parameters for this estimator.
:Parameters:
**deep** : bool, default=True
If True, will return the parameters for this estimator and
contained subobjects that are estimators.
:Returns:
**params** : dict
Parameter names mapped to their values.
..
!! processed by numpydoc !!
.. py:method:: score(X, y, sample_weight=None)
Return the mean accuracy on the given test data and labels.
In multi-label classification, this is the subset accuracy
which is a harsh metric since you require for each sample that
each label set be correctly predicted.
:Parameters:
**X** : array-like of shape (n_samples, n_features)
Test samples.
**y** : array-like of shape (n_samples,) or (n_samples, n_outputs)
True labels for `X`.
**sample_weight** : array-like of shape (n_samples,), default=None
Sample weights.
:Returns:
**score** : float
Mean accuracy of ``self.predict(X)`` w.r.t. `y`.
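To illustrate why subset accuracy is described as a harsh metric, here is a small pure-Python sketch. The helper `subset_accuracy` is hypothetical and only mirrors the definition given above: a sample counts as correct only when its entire label set matches.

```python
def subset_accuracy(y_true, y_pred):
    """Fraction of samples whose full label set is predicted exactly."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    hits = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return hits / len(y_true)


# Four samples with three binary labels each (made-up data).
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
y_pred = [[1, 0, 1], [0, 1, 1], [1, 1, 0], [0, 0, 1]]

# The second sample has one wrong label, so the whole sample counts as a miss.
acc = subset_accuracy(y_true, y_pred)
```

Here 11 of the 12 individual labels are correct, yet only 3 of the 4 samples count as hits.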
..
!! processed by numpydoc !!
.. py:method:: set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects
(such as :class:`~sklearn.pipeline.Pipeline`). The latter have
parameters of the form ``<component>__<parameter>`` so that it's
possible to update each component of a nested object.
:Parameters:
**\*\*params** : dict
Estimator parameters.
:Returns:
**self** : estimator instance
Estimator instance.
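Nested parameter names join a component name and a parameter name with a double underscore. The sketch below shows how such keys can be separated into top-level and per-component parameters. It is an illustration only, not the library's implementation; `split_params` and the component names `clf` and `scaler` are hypothetical.

```python
def split_params(params):
    """Group parameters by component using the ``__`` delimiter."""
    top_level, nested = {}, {}
    for key, value in params.items():
        # Split at the first double underscore only, so deeper nesting
        # stays in the sub-key and can be resolved by the component.
        name, delimiter, sub_key = key.partition("__")
        if delimiter:
            nested.setdefault(name, {})[sub_key] = value
        else:
            top_level[name] = value
    return top_level, nested


top_level, nested = split_params(
    {
        "n_jobs": 2,
        "clf__alpha": 1.0,
        "clf__fit_intercept": False,
        "scaler__with_mean": True,
    }
)
```

Under this scheme, `n_jobs` would be set on the estimator itself, while the `clf` and `scaler` entries would be forwarded to the components with those names.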
..
!! processed by numpydoc !!
.. py:class:: CastorRegressor(n_groups=64, n_shapelets=8, *, metric='euclidean', metric_params=None, normalize_prob=0.8, shapelet_size=11, lower=0.05, upper=0.1, order=1, soft_min=True, soft_max=False, soft_threshold=True, alphas=(0.1, 1.0, 10.0), fit_intercept=True, scoring=None, cv=None, normalize='sparse', random_state=None, n_jobs=None)
A dictionary-based method using dilated competing shapelets.
:Parameters:
**n_groups** : int, optional
The number of groups of dilated shapelets.
**n_shapelets** : int, optional
The number of dilated shapelets per group.
**metric** : str or callable, optional
The distance metric.
See ``_METRICS.keys()`` for a list of supported metrics.
**metric_params** : dict, optional
Parameters to the metric.
Read more about the parameters in the
:ref:`User guide `.
**normalize_prob** : float, optional
The probability of standardizing a shapelet with zero mean and unit
standard deviation.
**shapelet_size** : int, optional
The length of the dilated shapelet.
**lower** : float, optional
The lower percentile to draw distance thresholds above.
**upper** : float, optional
The upper percentile to draw distance thresholds below.
**order** : int or array-like, optional
The order of difference.
If int, half the groups with corresponding shapelets will be convolved
with the `order` discrete difference along the time dimension.
**soft_min** : bool, optional
If `True`, use the sum of minimal distances. Otherwise, use the count
of minimal distances.
**soft_max** : bool, optional
If `True`, use the sum of maximal distances. Otherwise, use the count
of maximal distances.
**soft_threshold** : bool, optional
If `True`, count the time steps below the threshold for all shapelets.
Otherwise, count the time steps below the threshold for the shapelet
with the minimal distance.
**alphas** : array-like of shape (n_alphas,), optional
Array of alpha values to try.
**fit_intercept** : bool, optional
Whether to calculate the intercept for this model.
**scoring** : str, callable, optional
A string or a scorer callable object with signature
`scorer(estimator, X, y)`.
**cv** : int, cross-validation generator or an iterable, optional
Determines the cross-validation splitting strategy.
**normalize** : "sparse" or bool, optional
Standardize before fitting. By default use
:class:`datasets.preprocess.SparseScaler` to standardize the attributes. Set
to `False` to disable or `True` to use `StandardScaler`.
**random_state** : int or RandomState, optional
Controls the random sampling of kernels.
- If `int`, `random_state` is the seed used by the random number
generator.
- If :class:`numpy.random.RandomState` instance, `random_state` is
the random number generator.
- If `None`, the random number generator is the
:class:`numpy.random.RandomState` instance used by
:func:`numpy.random`.
**n_jobs** : int, optional
The number of parallel jobs.
.. rubric:: Notes
For better performance with multivariate datasets, multiply `n_shapelets` by
the number of dimensions (`n_shapelets * n_dims`) to ensure feature variability.
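`normalize_prob` is described above as the probability of standardizing a shapelet to zero mean and unit standard deviation. A minimal sketch of that idea follows, assuming the population standard deviation and a guard for constant shapelets. Both choices, and the helper names `standardize` and `maybe_standardize`, are assumptions made for illustration rather than wildboar's implementation.

```python
import math
import random


def standardize(values):
    """Rescale values to zero mean and unit (population) standard deviation."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0.0:
        # A constant shapelet has no scale; return zeros (assumed convention).
        return [0.0] * n
    return [(v - mean) / std for v in values]


def maybe_standardize(values, normalize_prob, rng):
    """Standardize with probability `normalize_prob`, else return a copy."""
    if rng.random() < normalize_prob:
        return standardize(values)
    return list(values)


rng = random.Random(0)
shapelet = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # mean 5, std 2
z = standardize(shapelet)
always = maybe_standardize(shapelet, 1.0, rng)  # always standardized
never = maybe_standardize(shapelet, 0.0, rng)   # never standardized
```

With the default `normalize_prob=0.8`, roughly four out of five sampled shapelets would be standardized under this reading.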
..
!! processed by numpydoc !!
.. py:method:: get_metadata_routing()
Get metadata routing of this object.
Please check :ref:`User Guide ` on how the routing
mechanism works.
:Returns:
**routing** : MetadataRequest
A :class:`~sklearn.utils.metadata_routing.MetadataRequest` encapsulating
routing information.
..
!! processed by numpydoc !!
.. py:method:: get_params(deep=True)
Get parameters for this estimator.
:Parameters:
**deep** : bool, default=True
If True, will return the parameters for this estimator and
contained subobjects that are estimators.
:Returns:
**params** : dict
Parameter names mapped to their values.
..
!! processed by numpydoc !!
.. py:method:: score(X, y, sample_weight=None)
Return the coefficient of determination of the prediction.
The coefficient of determination :math:`R^2` is defined as
:math:`(1 - \frac{u}{v})`, where :math:`u` is the residual
sum of squares ``((y_true - y_pred) ** 2).sum()`` and :math:`v`
is the total sum of squares ``((y_true - y_true.mean()) ** 2).sum()``.
The best possible score is 1.0 and it can be negative (because the
model can be arbitrarily worse). A constant model that always predicts
the expected value of `y`, disregarding the input features, would get
a :math:`R^2` score of 0.0.
:Parameters:
**X** : array-like of shape (n_samples, n_features)
Test samples. For some estimators this may be a precomputed
kernel matrix or a list of generic objects instead with shape
``(n_samples, n_samples_fitted)``, where ``n_samples_fitted``
is the number of samples used in the fitting for the estimator.
**y** : array-like of shape (n_samples,) or (n_samples, n_outputs)
True values for `X`.
**sample_weight** : array-like of shape (n_samples,), default=None
Sample weights.
:Returns:
**score** : float
:math:`R^2` of ``self.predict(X)`` w.r.t. `y`.
.. rubric:: Notes
The :math:`R^2` score used when calling ``score`` on a regressor uses
``multioutput='uniform_average'`` from version 0.23 to keep consistent
with default value of :func:`~sklearn.metrics.r2_score`.
This influences the ``score`` method of all the multioutput
regressors (except for
:class:`~sklearn.multioutput.MultiOutputRegressor`).
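As a worked illustration of the definition :math:`R^2 = 1 - \frac{u}{v}`, the following pure-Python sketch computes the score by hand for a single output. The helper name `r_squared` and the sample values are made up for this example.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - u / v, for a single output."""
    mean = sum(y_true) / len(y_true)
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    v = sum((t - mean) ** 2 for t in y_true)               # total sum of squares
    return 1.0 - u / v


y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]

# u = 1.5 and v = 29.1875, so the score is 1 - 1.5 / 29.1875 (about 0.9486).
score = r_squared(y_true, y_pred)

# A constant prediction equal to the mean of y_true gives u == v, hence 0.0.
mean_prediction = [sum(y_true) / len(y_true)] * len(y_true)
constant = r_squared(y_true, mean_prediction)
```

A prediction worse than the constant mean model makes `u` larger than `v`, which is how the score becomes negative.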
..
!! processed by numpydoc !!
.. py:method:: set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects
(such as :class:`~sklearn.pipeline.Pipeline`). The latter have
parameters of the form ``<component>__<parameter>`` so that it's
possible to update each component of a nested object.
:Parameters:
**\*\*params** : dict
Estimator parameters.
:Returns:
**self** : estimator instance
Estimator instance.
..
!! processed by numpydoc !!
.. py:class:: DilatedShapeletClassifier(n_shapelets=1000, *, metric='euclidean', metric_params=None, normalize_prob=0.8, min_shapelet_size=None, max_shapelet_size=None, shapelet_size=None, lower=0.05, upper=0.1, alphas=(0.1, 1.0, 10.0), fit_intercept=True, scoring=None, cv=None, class_weight=None, normalize=True, random_state=None, n_jobs=None)
A classifier that uses random dilated shapelets.
:Parameters:
**n_shapelets** : int, optional
The number of dilated shapelets.
**metric** : str or callable, optional
The distance metric.
See ``_METRICS.keys()`` for a list of supported metrics.
**metric_params** : dict, optional
Parameters to the metric.
Read more about the parameters in the
:ref:`User guide `.
**normalize_prob** : float, optional
The probability of standardizing a shapelet with zero mean and unit
standard deviation.
**min_shapelet_size** : float, optional
The minimum shapelet size. If None, use the discrete sizes
in `shapelet_size`.
**max_shapelet_size** : float, optional
The maximum shapelet size. If None, use the discrete sizes
in `shapelet_size`.
**shapelet_size** : array-like, optional
The size of shapelets.
**lower** : float, optional
The lower percentile to draw distance thresholds above.
**upper** : float, optional
The upper percentile to draw distance thresholds below.
**alphas** : array-like of shape (n_alphas,), optional
Array of alpha values to try.
**fit_intercept** : bool, optional
Whether to calculate the intercept for this model.
**scoring** : str, callable, optional
A string or a scorer callable object with signature
`scorer(estimator, X, y)`.
**cv** : int, cross-validation generator or an iterable, optional
Determines the cross-validation splitting strategy.
**class_weight** : dict or 'balanced', optional
Weights associated with classes in the form `{class_label: weight}`.
**normalize** : bool, optional
Standardize before fitting.
**random_state** : int or RandomState, optional
Controls the random sampling of kernels.
- If `int`, `random_state` is the seed used by the random number
generator.
- If :class:`numpy.random.RandomState` instance, `random_state` is
the random number generator.
- If `None`, the random number generator is the
:class:`numpy.random.RandomState` instance used by
:func:`numpy.random`.
**n_jobs** : int, optional
The number of parallel jobs.
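For `class_weight='balanced'`, scikit-learn computes weights as `n_samples / (n_classes * count)` for each class. Assuming this classifier passes `class_weight` through to a scikit-learn linear model (an assumption, not stated above), the resulting dictionary can be sketched in pure Python. The helper `balanced_class_weight` is hypothetical.

```python
from collections import Counter


def balanced_class_weight(y):
    """Weights inversely proportional to class frequencies."""
    counts = Counter(y)
    n_samples, n_classes = len(y), len(counts)
    return {
        label: n_samples / (n_classes * count)
        for label, count in counts.items()
    }


y = ["a", "a", "a", "b"]  # made-up labels: class "a" is three times as frequent
weights = balanced_class_weight(y)
```

The result has the same `{class_label: weight}` form that can be passed explicitly, with the rare class receiving the larger weight.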
.. rubric:: References
Antoine Guillaume, Christel Vrain, Elloumi Wael
Random Dilated Shapelet Transform: A New Approach for Time Series Shapelets
Pattern Recognition and Artificial Intelligence, 2022
..
!! processed by numpydoc !!
.. py:method:: get_metadata_routing()
Get metadata routing of this object.
Please check :ref:`User Guide ` on how the routing
mechanism works.
:Returns:
**routing** : MetadataRequest
A :class:`~sklearn.utils.metadata_routing.MetadataRequest` encapsulating
routing information.
..
!! processed by numpydoc !!
.. py:method:: get_params(deep=True)
Get parameters for this estimator.
:Parameters:
**deep** : bool, default=True
If True, will return the parameters for this estimator and
contained subobjects that are estimators.
:Returns:
**params** : dict
Parameter names mapped to their values.
..
!! processed by numpydoc !!
.. py:method:: score(X, y, sample_weight=None)
Return the mean accuracy on the given test data and labels.
In multi-label classification, this is the subset accuracy
which is a harsh metric since you require for each sample that
each label set be correctly predicted.
:Parameters:
**X** : array-like of shape (n_samples, n_features)
Test samples.
**y** : array-like of shape (n_samples,) or (n_samples, n_outputs)
True labels for `X`.
**sample_weight** : array-like of shape (n_samples,), default=None
Sample weights.
:Returns:
**score** : float
Mean accuracy of ``self.predict(X)`` w.r.t. `y`.
..
!! processed by numpydoc !!
.. py:method:: set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects
(such as :class:`~sklearn.pipeline.Pipeline`). The latter have
parameters of the form ``<component>__<parameter>`` so that it's
possible to update each component of a nested object.
:Parameters:
**\*\*params** : dict
Estimator parameters.
:Returns:
**self** : estimator instance
Estimator instance.
..
!! processed by numpydoc !!
.. py:class:: HydraClassifier(*, n_groups=64, n_kernels=8, kernel_size=9, sampling='normal', sampling_params=None, order=1, alphas=(0.1, 1.0, 10.0), fit_intercept=True, scoring=None, cv=None, class_weight=None, normalize='sparse', n_jobs=None, random_state=None)
A dictionary-based method using convolutional kernels.
:Parameters:
**n_groups** : int, optional
The number of groups of kernels.
**n_kernels** : int, optional
The number of kernels per group.
**kernel_size** : int, optional
The size of the kernel.
**sampling** : {"normal"}, optional
The strategy for sampling kernels. By default kernel weights
are sampled from a normal distribution with zero mean and unit
standard deviation.
**sampling_params** : dict, optional
Parameters to the sampling approach. The "normal" sampler
accepts two parameters: `mean` and `scale`.
**order** : int, optional
The order of difference. If set, half the groups with corresponding
kernels will be convolved with the `order` discrete difference along
the time dimension.
**alphas** : array-like of shape (n_alphas,), optional
Array of alpha values to try.
**fit_intercept** : bool, optional
Whether to calculate the intercept for this model.
**scoring** : str, callable, optional
A string or a scorer callable object with signature
`scorer(estimator, X, y)`.
**cv** : int, cross-validation generator or an iterable, optional
Determines the cross-validation splitting strategy.
**class_weight** : dict or 'balanced', optional
Weights associated with classes in the form `{class_label: weight}`.
**normalize** : "sparse" or bool, optional
Standardize before fitting. By default use
:class:`datasets.preprocess.SparseScaler` to standardize the attributes. Set
to `False` to disable or `True` to use `StandardScaler`.
**n_jobs** : int, optional
The number of jobs to run in parallel. A value of `None` means using
a single core and a value of `-1` means using all cores. Positive
integers mean the exact number of cores.
**random_state** : int or RandomState, optional
Controls the random resampling of the original dataset.
- If `int`, `random_state` is the seed used by the random number
generator.
- If :class:`numpy.random.RandomState` instance, `random_state` is
the random number generator.
- If `None`, the random number generator is the
:class:`numpy.random.RandomState` instance used by
:func:`numpy.random`.
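The `n_jobs` convention described above (`None` for a single core, `-1` for all cores, positive integers for an exact count) can be sketched as follows. The helper `resolve_n_jobs` is hypothetical, and rejecting other values is a choice made for this sketch, since the text above does not say how they are handled.

```python
import os


def resolve_n_jobs(n_jobs):
    """Translate an `n_jobs` value into a concrete worker count."""
    if n_jobs is None:
        return 1  # single core
    if n_jobs == -1:
        # os.cpu_count() may return None, so fall back to one worker.
        return os.cpu_count() or 1
    if n_jobs > 0:
        return n_jobs  # exact number of cores
    raise ValueError(f"unsupported n_jobs value in this sketch: {n_jobs!r}")


single = resolve_n_jobs(None)
exact = resolve_n_jobs(4)
all_cores = resolve_n_jobs(-1)
```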
.. rubric:: References
Dempster, A., Schmidt, D. F., & Webb, G. I. (2023).
Hydra: competing convolutional kernels for fast and accurate
time series classification. Data Mining and Knowledge Discovery.
..
!! processed by numpydoc !!
.. py:method:: get_metadata_routing()
Get metadata routing of this object.
Please check :ref:`User Guide ` on how the routing
mechanism works.
:Returns:
**routing** : MetadataRequest
A :class:`~sklearn.utils.metadata_routing.MetadataRequest` encapsulating
routing information.
..
!! processed by numpydoc !!
.. py:method:: get_params(deep=True)
Get parameters for this estimator.
:Parameters:
**deep** : bool, default=True
If True, will return the parameters for this estimator and
contained subobjects that are estimators.
:Returns:
**params** : dict
Parameter names mapped to their values.
..
!! processed by numpydoc !!
.. py:method:: score(X, y, sample_weight=None)
Return the mean accuracy on the given test data and labels.
In multi-label classification, this is the subset accuracy
which is a harsh metric since you require for each sample that
each label set be correctly predicted.
:Parameters:
**X** : array-like of shape (n_samples, n_features)
Test samples.
**y** : array-like of shape (n_samples,) or (n_samples, n_outputs)
True labels for `X`.
**sample_weight** : array-like of shape (n_samples,), default=None
Sample weights.
:Returns:
**score** : float
Mean accuracy of ``self.predict(X)`` w.r.t. `y`.
..
!! processed by numpydoc !!
.. py:method:: set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects
(such as :class:`~sklearn.pipeline.Pipeline`). The latter have
parameters of the form ``<component>__<parameter>`` so that it's
possible to update each component of a nested object.
:Parameters:
**\*\*params** : dict
Estimator parameters.
:Returns:
**self** : estimator instance
Estimator instance.
..
!! processed by numpydoc !!
.. py:class:: RandomShapeletClassifier(n_shapelets=1000, *, metric='euclidean', metric_params=None, min_shapelet_size=0.1, max_shapelet_size=1.0, coverage_probability=None, variability=None, alphas=(0.1, 1.0, 10.0), fit_intercept=True, normalize=False, scoring=None, cv=None, class_weight=None, random_state=None, n_jobs=None)
A classifier that uses random shapelets.
:Parameters:
**n_shapelets** : int or {"log2", "sqrt", "auto"}, optional
The number of shapelets in the resulting transform.
- If "auto", the number of shapelets depends on the value of `strategy`.
For "best" the number is 1; and for "random" it is 1000.
- If "log2", the number of shapelets is the log2 of the total possible
number of shapelets.
- If "sqrt", the number of shapelets is the square root of the total
possible number of shapelets.
**metric** : str or list, optional
- If str, the distance metric used to identify the best shapelet.
- If list, multiple metrics specified as a list of tuples, where the first
element of the tuple is a metric name and the second element a dictionary
with a parameter grid specification. A parameter grid specification is a
dict with two mandatory and one optional key-value pairs defining the
lower and upper bound on the values and number of values in the grid. For
example, to specify a grid over the argument 'r' with 10 values in the
range 0 to 1, we would give the following specification: ``dict(min_r=0,
max_r=1, num_r=10)``.
Read more about the metrics and their parameters in the
:ref:`User guide `.
**metric_params** : dict, optional
Parameters for the distance measure. Ignored unless metric is a string.
Read more about the parameters in the :ref:`User guide
`.
**min_shapelet_size** : float, optional
Minimum shapelet size.
**max_shapelet_size** : float, optional
Maximum shapelet size.
**coverage_probability** : float, optional
The probability that a time step is covered by a
shapelet, in the range 0 < coverage_probability <= 1.
- For larger `coverage_probability`, we get larger shapelets.
- For smaller `coverage_probability`, we get shorter shapelets.
**variability** : float, optional
Controls the shape of the Beta distribution used to
sample shapelets. Defaults to 1.
- Higher `variability` creates more uniform intervals.
- Lower `variability` creates more variable interval sizes.
**alphas** : array-like of shape (n_alphas,), optional
Array of alpha values to try.
**fit_intercept** : bool, optional
Whether to calculate the intercept for this model.
**normalize** : bool, optional
Standardize before fitting.
**scoring** : str, callable, optional
A string or a scorer callable object with signature
`scorer(estimator, X, y)`.
**cv** : int, cross-validation generator or an iterable, optional
Determines the cross-validation splitting strategy.
**class_weight** : dict or 'balanced', optional
Weights associated with classes in the form `{class_label: weight}`.
**random_state** : int or RandomState, optional
Controls the random sampling of kernels.
- If `int`, `random_state` is the seed used by the random number
generator.
- If :class:`numpy.random.RandomState` instance, `random_state` is
the random number generator.
- If `None`, the random number generator is the
:class:`numpy.random.RandomState` instance used by
:func:`numpy.random`.
**n_jobs** : int, optional
The number of parallel jobs.
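The parameter grid specification described for `metric` uses `min_`, `max_` and `num_` prefixed keys. The sketch below expands such a specification into explicit values, assuming evenly spaced values that include both bounds. The spacing, the fallback of 10 values when `num_` is omitted, and the helper `expand_grid` are all assumptions for illustration; the text above does not state how wildboar expands the grid.

```python
def expand_grid(spec, default_num=10):
    """Expand ``min_<name>``/``max_<name>``/``num_<name>`` keys into value lists."""
    names = sorted(key[len("min_"):] for key in spec if key.startswith("min_"))
    grid = {}
    for name in names:
        lower, upper = spec[f"min_{name}"], spec[f"max_{name}"]
        num = spec.get(f"num_{name}", default_num)  # optional key (assumed default)
        if num == 1:
            grid[name] = [float(lower)]
            continue
        step = (upper - lower) / (num - 1)
        # Evenly spaced values from lower to upper, both included.
        grid[name] = [lower + i * step for i in range(num)]
    return grid


# The specification from the parameter description: 10 values of `r` in [0, 1].
grid = expand_grid(dict(min_r=0, max_r=1, num_r=10))

# A metric list in the documented form: (metric name, grid specification).
# The metric name "dtw" is used here only as an example value.
metric = [("dtw", dict(min_r=0, max_r=1, num_r=10))]
```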
.. rubric:: References
Wistuba, Martin, Josif Grabocka, and Lars Schmidt-Thieme.
Ultra-fast shapelets for time series classification. arXiv preprint
arXiv:1503.05018 (2015).
..
!! processed by numpydoc !!
.. py:method:: get_metadata_routing()
Get metadata routing of this object.
Please check :ref:`User Guide ` on how the routing
mechanism works.
:Returns:
**routing** : MetadataRequest
A :class:`~sklearn.utils.metadata_routing.MetadataRequest` encapsulating
routing information.
..
!! processed by numpydoc !!
.. py:method:: get_params(deep=True)
Get parameters for this estimator.
:Parameters:
**deep** : bool, default=True
If True, will return the parameters for this estimator and
contained subobjects that are estimators.
:Returns:
**params** : dict
Parameter names mapped to their values.
..
!! processed by numpydoc !!
.. py:method:: score(X, y, sample_weight=None)
Return the mean accuracy on the given test data and labels.
In multi-label classification, this is the subset accuracy
which is a harsh metric since you require for each sample that
each label set be correctly predicted.
:Parameters:
**X** : array-like of shape (n_samples, n_features)
Test samples.
**y** : array-like of shape (n_samples,) or (n_samples, n_outputs)
True labels for `X`.
**sample_weight** : array-like of shape (n_samples,), default=None
Sample weights.
:Returns:
**score** : float
Mean accuracy of ``self.predict(X)`` w.r.t. `y`.
..
!! processed by numpydoc !!
.. py:method:: set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects
(such as :class:`~sklearn.pipeline.Pipeline`). The latter have
parameters of the form ``<component>__<parameter>`` so that it's
possible to update each component of a nested object.
:Parameters:
**\*\*params** : dict
Estimator parameters.
:Returns:
**self** : estimator instance
Estimator instance.
..
!! processed by numpydoc !!
.. py:class:: RandomShapeletRegressor(n_shapelets=1000, *, metric='euclidean', metric_params=None, min_shapelet_size=0.1, max_shapelet_size=1.0, coverage_probability=None, variability=None, alphas=(0.1, 1.0, 10.0), fit_intercept=True, normalize=False, scoring=None, cv=None, gcv_mode=None, random_state=None, n_jobs=None)
A regressor that uses random shapelets.
:Parameters:
**n_shapelets** : int or {"log2", "sqrt", "auto"}, optional
The number of shapelets in the resulting transform.
- If "auto", the number of shapelets depends on the value of `strategy`.
For "best" the number is 1; and for "random" it is 1000.
- If "log2", the number of shapelets is the log2 of the total possible
number of shapelets.
- If "sqrt", the number of shapelets is the square root of the total
possible number of shapelets.
**metric** : str or list, optional
- If str, the distance metric used to identify the best shapelet.
- If list, multiple metrics specified as a list of tuples, where the first
element of the tuple is a metric name and the second element a dictionary
with a parameter grid specification. A parameter grid specification is a
dict with two mandatory and one optional key-value pairs defining the
lower and upper bound on the values and number of values in the grid. For
example, to specify a grid over the argument 'r' with 10 values in the
range 0 to 1, we would give the following specification: ``dict(min_r=0,
max_r=1, num_r=10)``.
Read more about the metrics and their parameters in the
:ref:`User guide `.
**metric_params** : dict, optional
Parameters for the distance measure. Ignored unless metric is a string.
Read more about the parameters in the :ref:`User guide
`.
**min_shapelet_size** : float, optional
Minimum shapelet size.
**max_shapelet_size** : float, optional
Maximum shapelet size.
**coverage_probability** : float, optional
The probability that a time step is covered by a
shapelet, in the range 0 < coverage_probability <= 1.
- For larger `coverage_probability`, we get larger shapelets.
- For smaller `coverage_probability`, we get shorter shapelets.
**variability** : float, optional
Controls the shape of the Beta distribution used to
sample shapelets. Defaults to 1.
- Higher `variability` creates more uniform intervals.
- Lower `variability` creates more variable interval sizes.
**alphas** : array-like of shape (n_alphas,), optional
Array of alpha values to try.
**fit_intercept** : bool, optional
Whether to calculate the intercept for this model.
**normalize** : bool, optional
Standardize before fitting.
**scoring** : str, callable, optional
A string or a scorer callable object with signature
`scorer(estimator, X, y)`.
**cv** : int, cross-validation generator or an iterable, optional
Determines the cross-validation splitting strategy.
**gcv_mode** : {'auto', 'svd', 'eigen'}, optional
Flag indicating which strategy to use when performing
Leave-One-Out Cross-Validation. Options are::
'auto' : use 'svd' if n_samples > n_features, otherwise use 'eigen'
'svd' : force use of singular value decomposition of X when X is
dense, eigenvalue decomposition of X^T.X when X is sparse.
'eigen' : force computation via eigendecomposition of X.X^T
The 'auto' mode is the default and is intended to pick the cheaper
option of the two depending on the shape of the training data.
**random_state** : int or RandomState, optional
Controls the random sampling of kernels.
- If `int`, `random_state` is the seed used by the random number
generator.
- If :class:`numpy.random.RandomState` instance, `random_state` is
the random number generator.
- If `None`, the random number generator is the
:class:`numpy.random.RandomState` instance used by
:func:`numpy.random`.
**n_jobs** : int, optional
The number of parallel jobs.
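The rule stated for `gcv_mode='auto'` can be written out as a small function. This is a sketch of the documented rule only. The helper `resolve_gcv_mode` is hypothetical, and treating `None` (the default in the signature) the same as `'auto'` is an assumption based on the statement that `'auto'` is the default mode.

```python
def resolve_gcv_mode(gcv_mode, n_samples, n_features):
    """Pick the decomposition strategy following the documented 'auto' rule."""
    if gcv_mode is None or gcv_mode == "auto":
        # 'svd' when there are more samples than features, otherwise 'eigen'.
        return "svd" if n_samples > n_features else "eigen"
    if gcv_mode in ("svd", "eigen"):
        return gcv_mode  # an explicit choice is respected as given
    raise ValueError(f"unknown gcv_mode: {gcv_mode!r}")


tall = resolve_gcv_mode("auto", n_samples=500, n_features=100)    # more samples
wide = resolve_gcv_mode(None, n_samples=50, n_features=1000)      # more features
forced = resolve_gcv_mode("eigen", n_samples=500, n_features=100)
```

With the default `n_shapelets=1000`, a small training set would fall into the second case under this rule, since the transformed data has more features than samples.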
.. rubric:: References
Wistuba, Martin, Josif Grabocka, and Lars Schmidt-Thieme.
Ultra-fast shapelets for time series classification. arXiv preprint
arXiv:1503.05018 (2015).
..
!! processed by numpydoc !!
.. py:method:: get_metadata_routing()
Get metadata routing of this object.
Please check :ref:`User Guide ` on how the routing
mechanism works.
:Returns:
**routing** : MetadataRequest
A :class:`~sklearn.utils.metadata_routing.MetadataRequest` encapsulating
routing information.
..
!! processed by numpydoc !!
.. py:method:: get_params(deep=True)
Get parameters for this estimator.
:Parameters:
**deep** : bool, default=True
If True, will return the parameters for this estimator and
contained subobjects that are estimators.
:Returns:
**params** : dict
Parameter names mapped to their values.
..
!! processed by numpydoc !!
.. py:method:: score(X, y, sample_weight=None)
Return the coefficient of determination of the prediction.
The coefficient of determination :math:`R^2` is defined as
:math:`(1 - \frac{u}{v})`, where :math:`u` is the residual
sum of squares ``((y_true - y_pred) ** 2).sum()`` and :math:`v`
is the total sum of squares ``((y_true - y_true.mean()) ** 2).sum()``.
The best possible score is 1.0 and it can be negative (because the
model can be arbitrarily worse). A constant model that always predicts
the expected value of `y`, disregarding the input features, would get
a :math:`R^2` score of 0.0.
:Parameters:
**X** : array-like of shape (n_samples, n_features)
Test samples. For some estimators this may be a precomputed
kernel matrix or a list of generic objects instead with shape
``(n_samples, n_samples_fitted)``, where ``n_samples_fitted``
is the number of samples used in the fitting for the estimator.
**y** : array-like of shape (n_samples,) or (n_samples, n_outputs)
True values for `X`.
**sample_weight** : array-like of shape (n_samples,), default=None
Sample weights.
:Returns:
**score** : float
:math:`R^2` of ``self.predict(X)`` w.r.t. `y`.
.. rubric:: Notes
The :math:`R^2` score used when calling ``score`` on a regressor uses
``multioutput='uniform_average'`` from version 0.23 to keep consistent
with default value of :func:`~sklearn.metrics.r2_score`.
This influences the ``score`` method of all the multioutput
regressors (except for
:class:`~sklearn.multioutput.MultiOutputRegressor`).
..
!! processed by numpydoc !!
.. py:method:: set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects
(such as :class:`~sklearn.pipeline.Pipeline`). The latter have
parameters of the form ``<component>__<parameter>`` so that it's
possible to update each component of a nested object.
:Parameters:
**\*\*params** : dict
Estimator parameters.
:Returns:
**self** : estimator instance
Estimator instance.
..
!! processed by numpydoc !!
.. py:class:: RocketClassifier(n_kernels=10000, *, sampling='normal', sampling_params=None, kernel_size=None, min_size=None, max_size=None, bias_prob=1.0, normalize_prob=1.0, padding_prob=0.5, alphas=(0.1, 1.0, 10.0), fit_intercept=True, scoring=None, cv=None, class_weight=None, normalize=True, random_state=None, n_jobs=None)
A classifier using Rocket transform.
:Parameters:
**n_kernels** : int, optional
The number of kernels to sample at each node.
**sampling** : {"normal", "uniform", "shapelet"}, optional
The sampling of convolutional filters.
- if "normal", sample filter according to a normal distribution with
``mean`` and ``scale``.
- if "uniform", sample filter according to a uniform distribution with
``lower`` and ``upper``.
- if "shapelet", sample filters as subsequences in the training data.
**sampling_params** : dict, optional
Parameters for the sampling strategy.
- if "normal", ``{"mean": float, "scale": float}``, defaults to
``{"mean": 0, "scale": 1}``.
- if "uniform", ``{"lower": float, "upper": float}``, defaults to
``{"lower": -1, "upper": 1}``.
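The two parametric strategies can be pictured with a small sketch. It assumes that each kernel weight is drawn independently and that missing keys fall back to the defaults listed above; the function ``sample_kernel`` is hypothetical and uses the standard library ``random`` module rather than wildboar's own sampler. The "shapelet" strategy is omitted because it needs training data.

```python
import random


def sample_kernel(size, sampling="normal", sampling_params=None, rng=None):
    """Draw `size` kernel weights (illustrative only, not wildboar's code)."""
    rng = random.Random() if rng is None else rng
    params = sampling_params or {}
    if sampling == "normal":
        # Documented defaults: {"mean": 0, "scale": 1}
        mean = params.get("mean", 0.0)
        scale = params.get("scale", 1.0)
        return [rng.gauss(mean, scale) for _ in range(size)]
    if sampling == "uniform":
        # Documented defaults: {"lower": -1, "upper": 1}
        lower = params.get("lower", -1.0)
        upper = params.get("upper", 1.0)
        return [rng.uniform(lower, upper) for _ in range(size)]
    raise ValueError(f"unsupported sampling in this sketch: {sampling!r}")


rng = random.Random(0)
normal_kernel = sample_kernel(7, "normal", rng=rng)
uniform_kernel = sample_kernel(11, "uniform", {"lower": 0.0, "upper": 0.5}, rng=rng)
```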
**kernel_size** : array-like, optional
The kernel size, by default ``[7, 11, 13]``.
**min_size** : float, optional
The minimum timestep size used for generating kernel sizes. If set,
``kernel_size`` is ignored.
**max_size** : float, optional
The maximum timestep size used for generating kernel sizes. If set,
``kernel_size`` is ignored.
**bias_prob** : float, optional
The probability of using the bias term.
**normalize_prob** : float, optional
The probability of performing normalization.
**padding_prob** : float, optional
The probability of padding with zeros.
**alphas** : array-like of shape (n_alphas,), optional
Array of alpha values to try.
**fit_intercept** : bool, optional
Whether to calculate the intercept for this model.
**scoring** : str or callable, optional
A string or a scorer callable object with signature
``scorer(estimator, X, y)``.
**cv** : int, cross-validation generator or an iterable, optional
Determines the cross-validation splitting strategy.
**class_weight** : dict or 'balanced', optional
Weights associated with classes in the form `{class_label: weight}`.
**normalize** : "sparse" or bool, optional
Standardize before fitting. If ``"sparse"``, use
:class:`datasets.preprocess.SparseScaler` to standardize the attributes. Set
to ``False`` to disable or ``True`` (the default in the signature above) to
use ``StandardScaler``.
**random_state** : int or RandomState, optional
Controls the random resampling of the original dataset.
- If ``int``, ``random_state`` is the seed used by the random number
generator.
- If :class:`numpy.random.RandomState` instance, ``random_state`` is
the random number generator.
- If ``None``, the random number generator is the
:class:`numpy.random.RandomState` instance used by
:mod:`numpy.random`.
**n_jobs** : int, optional
The number of jobs to run in parallel. A value of ``None`` means using
a single core and a value of ``-1`` means using all cores. Positive
integers mean the exact number of cores.
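The ``n_jobs`` rules described above can be summarised in a few lines. This is a sketch of the documented mapping only; the helper ``resolve_n_jobs`` is hypothetical, and values the text does not describe (zero or negatives other than ``-1``) are rejected here as an assumption.

```python
import os


def resolve_n_jobs(n_jobs=None):
    """Map an ``n_jobs`` value to a worker count, following the text above."""
    if n_jobs is None:
        return 1  # single core
    if n_jobs == -1:
        return os.cpu_count() or 1  # all cores; fall back to 1 if unknown
    if n_jobs > 0:
        return n_jobs  # exact number of cores
    # Not covered by the description above, so this sketch refuses it.
    raise ValueError(f"n_jobs={n_jobs!r} is not handled by this sketch")


single = resolve_n_jobs(None)
four = resolve_n_jobs(4)
all_cores = resolve_n_jobs(-1)
```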
..
!! processed by numpydoc !!
.. py:method:: get_metadata_routing()
Get metadata routing of this object.
Please check :ref:`User Guide <metadata_routing>` on how the routing
mechanism works.
:Returns:
**routing** : MetadataRequest
A :class:`~sklearn.utils.metadata_routing.MetadataRequest` encapsulating
routing information.
..
!! processed by numpydoc !!
.. py:method:: get_params(deep=True)
Get parameters for this estimator.
:Parameters:
**deep** : bool, default=True
If True, will return the parameters for this estimator and
contained subobjects that are estimators.
:Returns:
**params** : dict
Parameter names mapped to their values.
..
!! processed by numpydoc !!
.. py:method:: score(X, y, sample_weight=None)
Return the mean accuracy on the given test data and labels.
In multi-label classification, this is the subset accuracy
which is a harsh metric since you require for each sample that
each label set be correctly predicted.
:Parameters:
**X** : array-like of shape (n_samples, n_features)
Test samples.
**y** : array-like of shape (n_samples,) or (n_samples, n_outputs)
True labels for `X`.
**sample_weight** : array-like of shape (n_samples,), default=None
Sample weights.
:Returns:
**score** : float
Mean accuracy of ``self.predict(X)`` w.r.t. `y`.
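To make the difference between plain accuracy and subset accuracy concrete, the following sketch computes both with the same helper. It is plain Python rather than the library code, and ``mean_accuracy`` is a hypothetical name. In the multi-label case each sample's label set is compared as a whole, so one wrong label makes the entire sample count as a miss.

```python
def mean_accuracy(y_true, y_pred):
    """Fraction of samples whose prediction equals the true value exactly."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)


# Single-label: three of four predictions are correct.
single = mean_accuracy(["a", "b", "a", "b"], ["a", "b", "b", "b"])

# Multi-label: each row is a full label set; the second row has one wrong
# label and therefore counts as entirely incorrect.
multi = mean_accuracy(
    [(1, 0), (0, 1), (1, 1)],
    [(1, 0), (0, 0), (1, 1)],
)
```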
..
!! processed by numpydoc !!
.. py:method:: set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects
(such as :class:`~sklearn.pipeline.Pipeline`). The latter have
parameters of the form ``<component>__<parameter>`` so that it's
possible to update each component of a nested object.
:Parameters:
**\*\*params** : dict
Estimator parameters.
:Returns:
**self** : estimator instance
Estimator instance.
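The nested naming scheme joins a component name and a parameter name with a double underscore. The sketch below shows one way such keys could be separated into the estimator's own parameters and per-component parameters. It is not scikit-learn's implementation, and the component names ``scaler`` and ``clf`` are hypothetical.

```python
from collections import defaultdict


def split_nested_params(params):
    """Separate plain parameters from ``component__parameter`` entries."""
    own = {}
    nested = defaultdict(dict)
    for key, value in params.items():
        # Split on the first double underscore only.
        name, delim, sub_key = key.partition("__")
        if delim:
            nested[name][sub_key] = value
        else:
            own[name] = value
    return own, dict(nested)


own, nested = split_nested_params(
    {"n_kernels": 1000, "scaler__with_mean": False, "clf__alphas": (0.1, 1.0)}
)
```

Each entry in ``nested`` would then be forwarded to the matching component's own ``set_params``.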
..
!! processed by numpydoc !!
.. py:class:: RocketRegressor(n_kernels=10000, *, sampling='normal', sampling_params=None, kernel_size=None, min_size=None, max_size=None, bias_prob=1.0, normalize_prob=1.0, padding_prob=0.5, alphas=(0.1, 1.0, 10.0), fit_intercept=True, scoring=None, cv=None, gcv_mode=None, normalize=True, random_state=None, n_jobs=None)
A regressor using Rocket transform.
:Parameters:
**n_kernels** : int, optional
The number of kernels to sample.
**sampling** : {"normal", "uniform", "shapelet"}, optional
The sampling of convolutional filters.
- if "normal", sample filter according to a normal distribution with
``mean`` and ``scale``.
- if "uniform", sample filter according to a uniform distribution with
``lower`` and ``upper``.
- if "shapelet", sample filters as subsequences in the training data.
**sampling_params** : dict, optional
Parameters for the sampling strategy.
- if "normal", ``{"mean": float, "scale": float}``, defaults to
``{"mean": 0, "scale": 1}``.
- if "uniform", ``{"lower": float, "upper": float}``, defaults to
``{"lower": -1, "upper": 1}``.
**kernel_size** : array-like, optional
The kernel size, by default ``[7, 11, 13]``.
**min_size** : float, optional
The minimum timestep size used for generating kernel sizes. If set,
``kernel_size`` is ignored.
**max_size** : float, optional
The maximum timestep size used for generating kernel sizes. If set,
``kernel_size`` is ignored.
**bias_prob** : float, optional
The probability of using the bias term.
**normalize_prob** : float, optional
The probability of performing normalization.
**padding_prob** : float, optional
The probability of padding with zeros.
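One way to read the three probability parameters is as independent yes/no draws. The sketch below assumes the draws are made once per kernel, which the text above does not state; ``draw_kernel_flags`` is a hypothetical helper built on the standard library ``random`` module.

```python
import random


def draw_kernel_flags(bias_prob=1.0, normalize_prob=1.0, padding_prob=0.5, rng=None):
    """Decide, for one kernel, whether to use a bias, normalize, and pad."""
    rng = random.Random() if rng is None else rng
    # rng.random() lies in [0, 1), so a probability of 1.0 always succeeds
    # and a probability of 0.0 never does.
    return {
        "use_bias": rng.random() < bias_prob,
        "normalize": rng.random() < normalize_prob,
        "pad": rng.random() < padding_prob,
    }


rng = random.Random(0)
flags = [draw_kernel_flags(rng=rng) for _ in range(1000)]
pad_fraction = sum(f["pad"] for f in flags) / len(flags)
```

With the default values every kernel uses the bias term and normalization, while roughly half of the kernels are zero-padded.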
**alphas** : array-like of shape (n_alphas,), optional
Array of alpha values to try.
**fit_intercept** : bool, optional
Whether to calculate the intercept for this model.
**scoring** : str or callable, optional
A string or a scorer callable object with signature
``scorer(estimator, X, y)``.
**cv** : int, cross-validation generator or an iterable, optional
Determines the cross-validation splitting strategy.
**gcv_mode** : {'auto', 'svd', 'eigen'}, optional
Flag indicating which strategy to use when performing
Leave-One-Out Cross-Validation. Options are:
- ``'auto'``: use ``'svd'`` if ``n_samples > n_features``, otherwise use ``'eigen'``.
- ``'svd'``: force use of singular value decomposition of ``X`` when ``X`` is
dense, eigenvalue decomposition of ``X^T.X`` when ``X`` is sparse.
- ``'eigen'``: force computation via eigendecomposition of ``X.X^T``.
The 'auto' mode is the default and is intended to pick the cheaper
option of the two depending on the shape of the training data.
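The ``'auto'`` rule amounts to a single comparison of the training data's shape. The sketch below assumes that ``None`` (the value in the signature) behaves like ``'auto'``, which the text implies but does not state outright; ``resolve_gcv_mode`` is a hypothetical helper, not part of wildboar or scikit-learn.

```python
def resolve_gcv_mode(gcv_mode, n_samples, n_features):
    """Pick the decomposition strategy described above."""
    if gcv_mode is None or gcv_mode == "auto":  # assumption: None means 'auto'
        return "svd" if n_samples > n_features else "eigen"
    if gcv_mode in ("svd", "eigen"):
        return gcv_mode  # explicitly forced by the caller
    raise ValueError(f"unknown gcv_mode: {gcv_mode!r}")


# Few samples, many features: 'eigen' works on the smaller X.X^T matrix.
wide = resolve_gcv_mode("auto", n_samples=500, n_features=10000)
# Many samples, fewer features: 'svd' is chosen instead.
tall = resolve_gcv_mode(None, n_samples=20000, n_features=10000)
```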
**normalize** : "sparse" or bool, optional
Standardize before fitting. If ``"sparse"``, use
:class:`datasets.preprocess.SparseScaler` to standardize the attributes. Set
to ``False`` to disable or ``True`` (the default in the signature above) to
use ``StandardScaler``.
**random_state** : int or RandomState, optional
Controls the random resampling of the original dataset.
- If ``int``, ``random_state`` is the seed used by the random number
generator.
- If :class:`numpy.random.RandomState` instance, ``random_state`` is
the random number generator.
- If ``None``, the random number generator is the
:class:`numpy.random.RandomState` instance used by
:mod:`numpy.random`.
**n_jobs** : int, optional
The number of jobs to run in parallel. A value of ``None`` means using
a single core and a value of ``-1`` means using all cores. Positive
integers mean the exact number of cores.
..
!! processed by numpydoc !!
.. py:method:: get_metadata_routing()
Get metadata routing of this object.
Please check :ref:`User Guide <metadata_routing>` on how the routing
mechanism works.
:Returns:
**routing** : MetadataRequest
A :class:`~sklearn.utils.metadata_routing.MetadataRequest` encapsulating
routing information.
..
!! processed by numpydoc !!
.. py:method:: get_params(deep=True)
Get parameters for this estimator.
:Parameters:
**deep** : bool, default=True
If True, will return the parameters for this estimator and
contained subobjects that are estimators.
:Returns:
**params** : dict
Parameter names mapped to their values.
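The effect of ``deep`` can be illustrated with a toy object. The ``Component`` class below is hypothetical and only mimics the behaviour described above: with ``deep=True``, parameters of contained objects that themselves expose ``get_params`` are added under double-underscore keys.

```python
class Component:
    """Hypothetical estimator-like object exposing ``get_params``."""

    def __init__(self, **params):
        self._params = dict(params)

    def get_params(self, deep=True):
        out = {}
        for name, value in self._params.items():
            out[name] = value
            # With deep=True, also expose the parameters of nested components.
            if deep and hasattr(value, "get_params"):
                for sub_name, sub_value in value.get_params(deep=True).items():
                    out[f"{name}__{sub_name}"] = sub_value
        return out


scaler = Component(with_mean=True)
model = Component(n_kernels=10000, scaler=scaler)

shallow = model.get_params(deep=False)  # only the model's own parameters
deep = model.get_params(deep=True)      # adds "scaler__with_mean"
```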
..
!! processed by numpydoc !!
.. py:method:: score(X, y, sample_weight=None)
Return the coefficient of determination of the prediction.
The coefficient of determination :math:`R^2` is defined as
:math:`(1 - \frac{u}{v})`, where :math:`u` is the residual
sum of squares ``((y_true - y_pred) ** 2).sum()`` and :math:`v`
is the total sum of squares ``((y_true - y_true.mean()) ** 2).sum()``.
The best possible score is 1.0 and it can be negative (because the
model can be arbitrarily worse). A constant model that always predicts
the expected value of `y`, disregarding the input features, would get
an :math:`R^2` score of 0.0.
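The three cases mentioned above (a perfect fit, a constant mean prediction, and a model worse than the mean) can be checked with a direct transcription of the formula. This is a plain-Python sketch, not the scikit-learn implementation, and the helper name ``r2`` is hypothetical.

```python
def r2(y_true, y_pred):
    """Coefficient of determination, 1 - u / v."""
    mean = sum(y_true) / len(y_true)
    u = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    v = sum((t - mean) ** 2 for t in y_true)  # total sum of squares
    return 1.0 - u / v


y_true = [1.0, 2.0, 3.0, 4.0]

perfect = r2(y_true, y_true)                # u = 0, so the score is 1.0
constant = r2(y_true, [2.5] * 4)            # predicting the mean gives 0.0
reversed_ = r2(y_true, [4.0, 3.0, 2.0, 1.0])  # u = 20, v = 5, so the score is -3.0
```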
:Parameters:
**X** : array-like of shape (n_samples, n_features)
Test samples. For some estimators this may be a precomputed
kernel matrix or a list of generic objects instead with shape
``(n_samples, n_samples_fitted)``, where ``n_samples_fitted``
is the number of samples used in the fitting for the estimator.
**y** : array-like of shape (n_samples,) or (n_samples, n_outputs)
True values for `X`.
**sample_weight** : array-like of shape (n_samples,), default=None
Sample weights.
:Returns:
**score** : float
:math:`R^2` of ``self.predict(X)`` w.r.t. `y`.
.. rubric:: Notes
The :math:`R^2` score used when calling ``score`` on a regressor uses
``multioutput='uniform_average'`` from version 0.23 to keep consistent
with default value of :func:`~sklearn.metrics.r2_score`.
This influences the ``score`` method of all the multioutput
regressors (except for
:class:`~sklearn.multioutput.MultiOutputRegressor`).
..
!! processed by numpydoc !!
.. py:method:: set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects
(such as :class:`~sklearn.pipeline.Pipeline`). The latter have
parameters of the form ``<component>__<parameter>`` so that it's
possible to update each component of a nested object.
:Parameters:
**\*\*params** : dict
Estimator parameters.
:Returns:
**self** : estimator instance
Estimator instance.
..
!! processed by numpydoc !!