***********************
:py:mod:`wildboar.tree`
***********************
.. py:module:: wildboar.tree
.. autoapi-nested-parse::
Tree-based estimators for classification and regression.
Classes
-------
.. autoapisummary::
wildboar.tree.ExtraShapeletTreeClassifier
wildboar.tree.ExtraShapeletTreeRegressor
wildboar.tree.IntervalTreeClassifier
wildboar.tree.IntervalTreeRegressor
wildboar.tree.PivotTreeClassifier
wildboar.tree.ProximityTreeClassifier
wildboar.tree.RocketTreeClassifier
wildboar.tree.RocketTreeRegressor
wildboar.tree.ShapeletTreeClassifier
wildboar.tree.ShapeletTreeRegressor
Functions
---------
.. autoapisummary::
wildboar.tree.plot_tree
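As a quick orientation, a minimal sketch of plotting a fitted tree (assuming the bundled GunPoint dataset; see `plot_tree` for the exact signature and options):

>>> from wildboar.datasets import load_gun_point
>>> from wildboar.tree import ShapeletTreeClassifier, plot_tree
>>> X, y = load_gun_point()
>>> plot_tree(ShapeletTreeClassifier(random_state=1).fit(X, y))  # doctest: +SKIP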
.. py:class:: ExtraShapeletTreeClassifier(*, n_shapelets=1, max_depth=None, min_samples_leaf=1, min_impurity_decrease=0.0, min_samples_split=2, min_shapelet_size=0.0, max_shapelet_size=1.0, coverage_probability=None, variability=1, metric='euclidean', metric_params=None, criterion='entropy', class_weight=None, random_state=None)
An extra shapelet tree classifier.
Extra shapelet trees are constructed by sampling a distance threshold
uniformly in the range `[min(dist), max(dist)]`.
:Parameters:
**n_shapelets** : int, optional
The number of shapelets to sample at each node.
**max_depth** : int, optional
The maximum depth of the tree. If `None` the tree is expanded until all
leaves are pure or until all leaves contain less than `min_samples_split`
samples.
**min_samples_leaf** : int, optional
The minimum number of samples in a leaf.
**min_impurity_decrease** : float, optional
A split will be introduced only if the impurity decrease is larger than or
equal to this value.
**min_samples_split** : int, optional
The minimum number of samples to split an internal node.
**min_shapelet_size** : float, optional
The minimum length of a sampled shapelet, expressed as a fraction and computed
as `max(ceil(X.shape[-1] * min_shapelet_size), 2)`.
**max_shapelet_size** : float, optional
The maximum length of a sampled shapelet, expressed as a fraction, computed
as `ceil(X.shape[-1] * max_shapelet_size)`.
**coverage_probability** : float, optional
The probability that a time step is covered by a
shapelet, in the range 0 < coverage_probability <= 1.
- For larger `coverage_probability`, we get longer shapelets.
- For smaller `coverage_probability`, we get shorter shapelets.
**variability** : float, optional
Controls the shape of the Beta distribution used to
sample shapelets. Defaults to 1.
- Higher `variability` creates more uniform interval sizes.
- Lower `variability` creates more variable interval sizes.
**metric** : {"euclidean", "scaled_euclidean", "dtw", "scaled_dtw"}, optional
Distance metric used to identify the best shapelet.
**metric_params** : dict, optional
Parameters for the distance measure.
**criterion** : {"entropy", "gini"}, optional
The criterion used to evaluate the utility of a split.
**class_weight** : dict or "balanced", optional
Weights associated with the labels.
- if dict, weights of the form `{label: weight}`.
- if "balanced", each class weight is inversely proportional to the class
frequency.
- if None, each class has equal weight.
**random_state** : int or RandomState, optional
- If `int`, `random_state` is the seed used by the random number generator;
- If `RandomState` instance, `random_state` is the random number generator;
- If `None`, the random number generator is the `RandomState` instance used
by `np.random`.
:Attributes:
**tree_** : Tree
The tree representation.
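.. rubric:: Examples

A minimal usage sketch, assuming the bundled GunPoint dataset:

>>> from wildboar.datasets import load_gun_point
>>> from wildboar.tree import ExtraShapeletTreeClassifier
>>> X, y = load_gun_point()
>>> clf = ExtraShapeletTreeClassifier(random_state=1)
>>> clf.fit(X, y).score(X, y)  # doctest: +SKIP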
.. py:method:: apply(x, check_input=True)
Return the index of the leaf that each sample is predicted by.
:Parameters:
**x** : array-like of shape (n_samples, n_timestep) or (n_samples, n_dims, n_timestep)
The input samples.
**check_input** : bool, optional
Bypass array validation. Only set to False if you are sure your data
is valid.
:Returns:
ndarray of shape (n_samples, )
For every sample, return the index of the leaf that the sample
ends up in. The index is in the range [0; node_count].
.. rubric:: Examples
Get the leaf probability distribution of a prediction:
>>> from wildboar.datasets import load_gun_point
>>> from wildboar.tree import ShapeletTreeClassifier
>>> X, y = load_gun_point()
>>> tree = ShapeletTreeClassifier()
>>> tree.fit(X, y)
>>> leaves = tree.apply(X)
>>> tree.tree_.value.take(leaves, axis=0)
array([[0., 1.],
[0., 1.],
[1., 0.]])
This is equivalent to using `tree.predict_proba`.
.. py:method:: decision_path(x, check_input=True)
Compute the decision path of the tree.
:Parameters:
**x** : array-like of shape (n_samples, n_timestep) or (n_samples, n_dims, n_timestep)
The input samples.
**check_input** : bool, optional
Bypass array validation. Only set to False if you are sure your data
is valid.
:Returns:
sparse matrix of shape (n_samples, n_nodes)
An indicator array where each nonzero value indicates that the sample
traverses the corresponding node.
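.. rubric:: Examples

A minimal sketch that inspects the nodes visited by the first sample (same data as the `apply` example):

>>> from wildboar.datasets import load_gun_point
>>> from wildboar.tree import ShapeletTreeClassifier
>>> X, y = load_gun_point()
>>> tree = ShapeletTreeClassifier().fit(X, y)
>>> tree.decision_path(X)[0].indices  # doctest: +SKIP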
.. py:method:: fit(x, y, sample_weight=None, check_input=True)
Fit a classification tree.
:Parameters:
**x** : array-like of shape (n_samples, n_timesteps)
The training time series.
**y** : array-like of shape (n_samples,)
The target values.
**sample_weight** : array-like of shape (n_samples,), optional
If `None`, then samples are equally weighted. Splits that would create child
nodes with net zero or negative weight are ignored while searching for a
split in each node. Splits are also ignored if they would result in any
single class carrying a negative weight in either child node.
**check_input** : bool, optional
Allows bypassing several input checks.
:Returns:
self
This instance.
.. py:method:: get_metadata_routing()
Get metadata routing of this object.
Please check :ref:`User Guide <metadata_routing>` on how the routing
mechanism works.
:Returns:
**routing** : MetadataRequest
A :class:`~sklearn.utils.metadata_routing.MetadataRequest` encapsulating
routing information.
.. py:method:: get_params(deep=True)
Get parameters for this estimator.
:Parameters:
**deep** : bool, default=True
If True, will return the parameters for this estimator and
contained subobjects that are estimators.
:Returns:
**params** : dict
Parameter names mapped to their values.
.. py:method:: predict(x, check_input=True)
Predict the class for each sample in x.
:Parameters:
**x** : array-like of shape (n_samples, n_timesteps)
The input time series.
**check_input** : bool, optional
Allows bypassing several input checks. Don't use this parameter unless you
know what you are doing.
:Returns:
ndarray of shape (n_samples,)
The predicted classes.
.. py:method:: predict_proba(x, check_input=True)
Predict class probabilities of the input samples X.
The predicted class probability is the fraction of samples of the same
class in a leaf.
:Parameters:
**x** : array-like of shape (n_samples, n_timesteps)
The input time series.
**check_input** : bool, optional
Allows bypassing several input checks. Don't use this parameter unless you
know what you are doing.
:Returns:
ndarray of shape (n_samples, n_classes)
The class probabilities of the input samples. The order of the classes
corresponds to that in the attribute `classes_`.
.. py:method:: score(X, y, sample_weight=None)
Return the mean accuracy on the given test data and labels.
In multi-label classification, this is the subset accuracy
which is a harsh metric since you require for each sample that
each label set be correctly predicted.
:Parameters:
**X** : array-like of shape (n_samples, n_features)
Test samples.
**y** : array-like of shape (n_samples,) or (n_samples, n_outputs)
True labels for `X`.
**sample_weight** : array-like of shape (n_samples,), default=None
Sample weights.
:Returns:
**score** : float
Mean accuracy of ``self.predict(X)`` w.r.t. `y`.
.. py:method:: set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects
(such as :class:`~sklearn.pipeline.Pipeline`). The latter have
parameters of the form ``<component>__<parameter>`` so that it's
possible to update each component of a nested object.
:Parameters:
**\*\*params** : dict
Estimator parameters.
:Returns:
**self** : estimator instance
Estimator instance.
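.. rubric:: Examples

A minimal sketch of updating a nested parameter; the pipeline and the step name ``tree`` are illustrative assumptions, not part of wildboar:

>>> from sklearn.pipeline import Pipeline
>>> from wildboar.tree import ExtraShapeletTreeClassifier
>>> pipe = Pipeline([("tree", ExtraShapeletTreeClassifier())])
>>> pipe.set_params(tree__max_depth=3)  # doctest: +SKIP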
.. py:class:: ExtraShapeletTreeRegressor(*, n_shapelets=1, max_depth=None, min_samples_split=2, min_samples_leaf=1, min_impurity_decrease=0.0, min_shapelet_size=0.0, max_shapelet_size=1.0, coverage_probability=None, variability=1, metric='euclidean', metric_params=None, criterion='squared_error', random_state=None)
An extra shapelet tree regressor.
Extra shapelet trees are constructed by sampling a distance threshold
uniformly in the range [min(dist), max(dist)].
:Parameters:
**n_shapelets** : int, optional
The number of shapelets to sample at each node.
**max_depth** : int, optional
The maximum depth of the tree. If `None` the tree is expanded until all
leaves are pure or until all leaves contain less than `min_samples_split`
samples.
**min_samples_split** : int, optional
The minimum number of samples to split an internal node.
**min_samples_leaf** : int, optional
The minimum number of samples in a leaf.
**criterion** : {"squared_error"}, optional
The criterion used to evaluate the utility of a split.
.. deprecated:: 1.1
Criterion "mse" was deprecated in v1.1 and removed in version 1.2.
**min_impurity_decrease** : float, optional
A split will be introduced only if the impurity decrease is larger than or
equal to this value.
**min_shapelet_size** : float, optional
The minimum length of a sampled shapelet, expressed as a fraction and computed
as `max(ceil(X.shape[-1] * min_shapelet_size), 2)`.
**max_shapelet_size** : float, optional
The maximum length of a sampled shapelet, expressed as a fraction, computed
as `ceil(X.shape[-1] * max_shapelet_size)`.
**coverage_probability** : float, optional
The probability that a time step is covered by a
shapelet, in the range 0 < coverage_probability <= 1.
- For larger `coverage_probability`, we get longer shapelets.
- For smaller `coverage_probability`, we get shorter shapelets.
**variability** : float, optional
Controls the shape of the Beta distribution used to
sample shapelets. Defaults to 1.
- Higher `variability` creates more uniform interval sizes.
- Lower `variability` creates more variable interval sizes.
**metric** : {'euclidean', 'scaled_euclidean', 'scaled_dtw'}, optional
Distance metric used to identify the best shapelet.
**metric_params** : dict, optional
Parameters for the distance measure.
**random_state** : int or RandomState
- If `int`, `random_state` is the seed used by the random number generator;
- If `RandomState` instance, `random_state` is the random number generator;
- If `None`, the random number generator is the `RandomState` instance used
by `np.random`.
:Attributes:
**tree_** : Tree
The internal tree representation.
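.. rubric:: Examples

A minimal sketch; GunPoint is a classification dataset, so the labels are cast to float purely for illustration:

>>> from wildboar.datasets import load_gun_point
>>> from wildboar.tree import ExtraShapeletTreeRegressor
>>> X, y = load_gun_point()
>>> reg = ExtraShapeletTreeRegressor(random_state=1)
>>> reg.fit(X, y.astype(float)).score(X, y.astype(float))  # doctest: +SKIP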
.. py:method:: apply(x, check_input=True)
Return the index of the leaf that each sample is predicted by.
:Parameters:
**x** : array-like of shape (n_samples, n_timestep) or (n_samples, n_dims, n_timestep)
The input samples.
**check_input** : bool, optional
Bypass array validation. Only set to False if you are sure your data
is valid.
:Returns:
ndarray of shape (n_samples, )
For every sample, return the index of the leaf that the sample
ends up in. The index is in the range [0; node_count].
.. rubric:: Examples
Get the leaf probability distribution of a prediction:
>>> from wildboar.datasets import load_gun_point
>>> from wildboar.tree import ShapeletTreeClassifier
>>> X, y = load_gun_point()
>>> tree = ShapeletTreeClassifier()
>>> tree.fit(X, y)
>>> leaves = tree.apply(X)
>>> tree.tree_.value.take(leaves, axis=0)
array([[0., 1.],
[0., 1.],
[1., 0.]])
This is equivalent to using `tree.predict_proba`.
.. py:method:: decision_path(x, check_input=True)
Compute the decision path of the tree.
:Parameters:
**x** : array-like of shape (n_samples, n_timestep) or (n_samples, n_dims, n_timestep)
The input samples.
**check_input** : bool, optional
Bypass array validation. Only set to False if you are sure your data
is valid.
:Returns:
sparse matrix of shape (n_samples, n_nodes)
An indicator array where each nonzero value indicates that the sample
traverses the corresponding node.
.. py:method:: fit(x, y, sample_weight=None, check_input=True)
Fit the estimator.
:Parameters:
**x** : array-like of shape (n_samples, n_timesteps)
The training time series.
**y** : array-like of shape (n_samples,)
Target values as floating point values.
**sample_weight** : array-like of shape (n_samples,), optional
If `None`, then samples are equally weighted. Splits that would create child
nodes with net zero or negative weight are ignored while searching for a
split in each node. Splits are also ignored if they would result in any
single class carrying a negative weight in either child node.
**check_input** : bool, optional
Allows bypassing several input checks.
:Returns:
self
This object.
.. py:method:: get_metadata_routing()
Get metadata routing of this object.
Please check :ref:`User Guide <metadata_routing>` on how the routing
mechanism works.
:Returns:
**routing** : MetadataRequest
A :class:`~sklearn.utils.metadata_routing.MetadataRequest` encapsulating
routing information.
.. py:method:: get_params(deep=True)
Get parameters for this estimator.
:Parameters:
**deep** : bool, default=True
If True, will return the parameters for this estimator and
contained subobjects that are estimators.
:Returns:
**params** : dict
Parameter names mapped to their values.
.. py:method:: predict(x, check_input=True)
Predict the value of each sample in x.
:Parameters:
**x** : array-like of shape (n_samples, n_timesteps)
The input time series.
**check_input** : bool, optional
Allows bypassing several input checks. Don't use this parameter unless you
know what you are doing.
:Returns:
ndarray of shape (n_samples,)
The predicted values.
.. py:method:: score(X, y, sample_weight=None)
Return the coefficient of determination of the prediction.
The coefficient of determination :math:`R^2` is defined as
:math:`(1 - \frac{u}{v})`, where :math:`u` is the residual
sum of squares ``((y_true - y_pred) ** 2).sum()`` and :math:`v`
is the total sum of squares ``((y_true - y_true.mean()) ** 2).sum()``.
The best possible score is 1.0 and it can be negative (because the
model can be arbitrarily worse). A constant model that always predicts
the expected value of `y`, disregarding the input features, would get
a :math:`R^2` score of 0.0.
:Parameters:
**X** : array-like of shape (n_samples, n_features)
Test samples. For some estimators this may be a precomputed
kernel matrix or a list of generic objects instead with shape
``(n_samples, n_samples_fitted)``, where ``n_samples_fitted``
is the number of samples used in the fitting for the estimator.
**y** : array-like of shape (n_samples,) or (n_samples, n_outputs)
True values for `X`.
**sample_weight** : array-like of shape (n_samples,), default=None
Sample weights.
:Returns:
**score** : float
:math:`R^2` of ``self.predict(X)`` w.r.t. `y`.
.. rubric:: Notes
The :math:`R^2` score used when calling ``score`` on a regressor uses
``multioutput='uniform_average'`` from version 0.23 to keep consistent
with default value of :func:`~sklearn.metrics.r2_score`.
This influences the ``score`` method of all the multioutput
regressors (except for
:class:`~sklearn.multioutput.MultiOutputRegressor`).
.. py:method:: set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects
(such as :class:`~sklearn.pipeline.Pipeline`). The latter have
parameters of the form ``<component>__<parameter>`` so that it's
possible to update each component of a nested object.
:Parameters:
**\*\*params** : dict
Estimator parameters.
:Returns:
**self** : estimator instance
Estimator instance.
.. py:class:: IntervalTreeClassifier(n_intervals='sqrt', *, max_depth=None, min_samples_split=2, min_samples_leaf=1, min_impurity_decrease=0.0, criterion='entropy', intervals='fixed', sample_size=None, min_size=0.0, max_size=1.0, coverage_probability=None, variability=1, summarizer='mean_var_slope', class_weight=None, random_state=None)
An interval based tree classifier.
:Parameters:
**n_intervals** : {"log", "sqrt"}, int or float, optional
The number of intervals to partition the time series into.
- if "log", the number of intervals is `log2(n_timestep)`.
- if "sqrt", the number of intervals is `sqrt(n_timestep)`.
- if int, the number of intervals is `n_intervals`.
- if float, the number of intervals is `n_intervals * n_timestep`, with
`0 < n_intervals < 1`.
**max_depth** : int, optional
The maximum depth of the tree. If `None` the tree is expanded until all
leaves are pure or until all leaves contain less than `min_samples_split`
samples.
**min_samples_split** : int, optional
The minimum number of samples to split an internal node.
**min_samples_leaf** : int, optional
The minimum number of samples in a leaf.
**min_impurity_decrease** : float, optional
A split will be introduced only if the impurity decrease is larger than or
equal to this value.
**criterion** : {"entropy", "gini"}, optional
The criterion used to evaluate the utility of a split.
**intervals** : {"fixed", "sample", "random"}, optional
- if "fixed", `n_intervals` non-overlapping intervals.
- if "sample", `n_intervals * sample_size` non-overlapping intervals.
- if "random", `n_intervals` possibly overlapping intervals of randomly
sampled in `[min_size * n_timestep, max_size * n_timestep]`.
**sample_size** : float, optional
The fraction of intervals to sample at each node. Ignored unless
`intervals="sample"`.
**min_size** : float, optional
The minimum interval size if `intervals="random"`. Ignored if
`coverage_probability` is set.
**max_size** : float, optional
The maximum interval size if `intervals="random"`. Ignored if
`coverage_probability` is set.
**coverage_probability** : float, optional
The probability that a time step is covered by an interval, in the
range 0 < coverage_probability <= 1.
- For larger `coverage_probability`, we get longer intervals.
- For smaller `coverage_probability`, we get shorter intervals.
**variability** : float, optional
Controls the shape of the Beta distribution used to
sample intervals. Defaults to 1.
- Higher `variability` creates more uniform interval sizes.
- Lower `variability` creates more variable interval sizes.
**summarizer** : str or list, optional
The method to summarize each interval.
- if str, the summarizer is determined by `_SUMMARIZERS.keys()`.
- if list, the summarizer is a list of functions `f(x) -> float`, where
`x` is a numpy array.
The default summarizer summarizes each interval as its mean, variance
and slope.
**class_weight** : dict or "balanced", optional
Weights associated with the labels.
- if dict, weights of the form `{label: weight}`.
- if "balanced", each class weight is inversely proportional to the class
frequency.
- if None, each class has equal weight.
**random_state** : int or RandomState, optional
- If `int`, `random_state` is the seed used by the random number generator
- If `RandomState` instance, `random_state` is the random number generator
- If `None`, the random number generator is the `RandomState` instance used
by `np.random`.
:Attributes:
**tree_** : Tree
The internal tree structure.
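.. rubric:: Examples

A minimal usage sketch (GunPoint as above; the parameter choices are illustrative):

>>> from wildboar.datasets import load_gun_point
>>> from wildboar.tree import IntervalTreeClassifier
>>> X, y = load_gun_point()
>>> clf = IntervalTreeClassifier(n_intervals="sqrt", intervals="random", random_state=1)
>>> clf.fit(X, y).predict(X[:2])  # doctest: +SKIP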
.. py:method:: apply(x, check_input=True)
Return the index of the leaf that each sample is predicted by.
:Parameters:
**x** : array-like of shape (n_samples, n_timestep) or (n_samples, n_dims, n_timestep)
The input samples.
**check_input** : bool, optional
Bypass array validation. Only set to False if you are sure your data
is valid.
:Returns:
ndarray of shape (n_samples, )
For every sample, return the index of the leaf that the sample
ends up in. The index is in the range [0; node_count].
.. rubric:: Examples
Get the leaf probability distribution of a prediction:
>>> from wildboar.datasets import load_gun_point
>>> from wildboar.tree import ShapeletTreeClassifier
>>> X, y = load_gun_point()
>>> tree = ShapeletTreeClassifier()
>>> tree.fit(X, y)
>>> leaves = tree.apply(X)
>>> tree.tree_.value.take(leaves, axis=0)
array([[0., 1.],
[0., 1.],
[1., 0.]])
This is equivalent to using `tree.predict_proba`.
.. py:method:: decision_path(x, check_input=True)
Compute the decision path of the tree.
:Parameters:
**x** : array-like of shape (n_samples, n_timestep) or (n_samples, n_dims, n_timestep)
The input samples.
**check_input** : bool, optional
Bypass array validation. Only set to False if you are sure your data
is valid.
:Returns:
sparse matrix of shape (n_samples, n_nodes)
An indicator array where each nonzero value indicates that the sample
traverses the corresponding node.
.. py:method:: fit(x, y, sample_weight=None, check_input=True)
Fit a classification tree.
:Parameters:
**x** : array-like of shape (n_samples, n_timesteps)
The training time series.
**y** : array-like of shape (n_samples,)
The target values.
**sample_weight** : array-like of shape (n_samples,), optional
If `None`, then samples are equally weighted. Splits that would create child
nodes with net zero or negative weight are ignored while searching for a
split in each node. Splits are also ignored if they would result in any
single class carrying a negative weight in either child node.
**check_input** : bool, optional
Allows bypassing several input checks.
:Returns:
self
This instance.
.. py:method:: get_metadata_routing()
Get metadata routing of this object.
Please check :ref:`User Guide <metadata_routing>` on how the routing
mechanism works.
:Returns:
**routing** : MetadataRequest
A :class:`~sklearn.utils.metadata_routing.MetadataRequest` encapsulating
routing information.
.. py:method:: get_params(deep=True)
Get parameters for this estimator.
:Parameters:
**deep** : bool, default=True
If True, will return the parameters for this estimator and
contained subobjects that are estimators.
:Returns:
**params** : dict
Parameter names mapped to their values.
.. py:method:: predict(x, check_input=True)
Predict the class for each sample in x.
:Parameters:
**x** : array-like of shape (n_samples, n_timesteps)
The input time series.
**check_input** : bool, optional
Allows bypassing several input checks. Don't use this parameter unless you
know what you are doing.
:Returns:
ndarray of shape (n_samples,)
The predicted classes.
.. py:method:: predict_proba(x, check_input=True)
Predict class probabilities of the input samples X.
The predicted class probability is the fraction of samples of the same
class in a leaf.
:Parameters:
**x** : array-like of shape (n_samples, n_timesteps)
The input time series.
**check_input** : bool, optional
Allows bypassing several input checks. Don't use this parameter unless you
know what you are doing.
:Returns:
ndarray of shape (n_samples, n_classes)
The class probabilities of the input samples. The order of the classes
corresponds to that in the attribute `classes_`.
.. py:method:: score(X, y, sample_weight=None)
Return the mean accuracy on the given test data and labels.
In multi-label classification, this is the subset accuracy
which is a harsh metric since you require for each sample that
each label set be correctly predicted.
:Parameters:
**X** : array-like of shape (n_samples, n_features)
Test samples.
**y** : array-like of shape (n_samples,) or (n_samples, n_outputs)
True labels for `X`.
**sample_weight** : array-like of shape (n_samples,), default=None
Sample weights.
:Returns:
**score** : float
Mean accuracy of ``self.predict(X)`` w.r.t. `y`.
.. py:method:: set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects
(such as :class:`~sklearn.pipeline.Pipeline`). The latter have
parameters of the form ``<component>__<parameter>`` so that it's
possible to update each component of a nested object.
:Parameters:
**\*\*params** : dict
Estimator parameters.
:Returns:
**self** : estimator instance
Estimator instance.
.. py:class:: IntervalTreeRegressor(n_intervals='sqrt', *, max_depth=None, min_samples_split=2, min_samples_leaf=1, min_impurity_decrease=0.0, criterion='squared_error', intervals='fixed', sample_size=None, min_size=0.0, max_size=1.0, coverage_probability=None, variability=1, summarizer='mean_var_slope', random_state=None)
An interval based tree regressor.
:Parameters:
**n_intervals** : {"log", "sqrt"}, int or float, optional
The number of intervals to partition the time series into.
- if "log", the number of intervals is `log2(n_timestep)`.
- if "sqrt", the number of intervals is `sqrt(n_timestep)`.
- if int, the number of intervals is `n_intervals`.
- if float, the number of intervals is `n_intervals * n_timestep`, with
`0 < n_intervals < 1`.
**max_depth** : int, optional
The maximum depth of the tree. If `None` the tree is expanded until all
leaves are pure or until all leaves contain less than `min_samples_split`
samples.
**min_samples_split** : int, optional
The minimum number of samples to split an internal node.
**min_samples_leaf** : int, optional
The minimum number of samples in a leaf.
**min_impurity_decrease** : float, optional
A split will be introduced only if the impurity decrease is larger than or
equal to this value.
**criterion** : {"squared_error"}, optional
The criterion used to evaluate the utility of a split.
**intervals** : {"fixed", "sample", "random"}, optional
- if "fixed", `n_intervals` non-overlapping intervals.
- if "sample", `n_intervals * sample_size` non-overlapping intervals.
- if "random", `n_intervals` possibly overlapping intervals of randomly
sampled in `[min_size * n_timestep, max_size * n_timestep]`.
**sample_size** : float, optional
The fraction of intervals to sample at each node. Ignored unless
`intervals="sample"`.
**min_size** : float, optional
The minimum interval size if `intervals="random"`. Ignored if
`coverage_probability` is set.
**max_size** : float, optional
The maximum interval size if `intervals="random"`. Ignored if
`coverage_probability` is set.
**coverage_probability** : float, optional
The probability that a time step is covered by an interval, in the range
0 < coverage_probability <= 1.
- For larger `coverage_probability`, we get longer intervals.
- For smaller `coverage_probability`, we get shorter intervals.
**variability** : float, optional
Controls the shape of the Beta distribution used to
sample intervals. Defaults to 1.
- Higher `variability` creates more uniform interval sizes.
- Lower `variability` creates more variable interval sizes.
**summarizer** : str or list, optional
The method to summarize each interval.
- if str, the summarizer is determined by `_SUMMARIZERS.keys()`.
- if list, the summarizer is a list of functions `f(x) -> float`, where
`x` is a numpy array.
The default summarizer summarizes each interval as its mean, variance
and slope.
**random_state** : int or RandomState, optional
- If `int`, `random_state` is the seed used by the random number generator
- If `RandomState` instance, `random_state` is the random number generator
- If `None`, the random number generator is the `RandomState` instance used
by `np.random`.
:Attributes:
**tree_** : Tree
The internal tree structure.
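.. rubric:: Examples

A minimal sketch; GunPoint is a classification dataset, so the labels are cast to float purely for illustration:

>>> from wildboar.datasets import load_gun_point
>>> from wildboar.tree import IntervalTreeRegressor
>>> X, y = load_gun_point()
>>> reg = IntervalTreeRegressor(n_intervals="sqrt", random_state=1)
>>> reg.fit(X, y.astype(float)).predict(X[:2])  # doctest: +SKIP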
.. py:method:: apply(x, check_input=True)
Return the index of the leaf that each sample is predicted by.
:Parameters:
**x** : array-like of shape (n_samples, n_timestep) or (n_samples, n_dims, n_timestep)
The input samples.
**check_input** : bool, optional
Bypass array validation. Only set to False if you are sure your data
is valid.
:Returns:
ndarray of shape (n_samples, )
For every sample, return the index of the leaf that the sample
ends up in. The index is in the range [0; node_count].
.. rubric:: Examples
Get the leaf probability distribution of a prediction:
>>> from wildboar.datasets import load_gun_point
>>> from wildboar.tree import ShapeletTreeClassifier
>>> X, y = load_gun_point()
>>> tree = ShapeletTreeClassifier()
>>> tree.fit(X, y)
>>> leaves = tree.apply(X)
>>> tree.tree_.value.take(leaves, axis=0)
array([[0., 1.],
[0., 1.],
[1., 0.]])
This is equivalent to using `tree.predict_proba`.
.. py:method:: decision_path(x, check_input=True)
Compute the decision path of the tree.
:Parameters:
**x** : array-like of shape (n_samples, n_timestep) or (n_samples, n_dims, n_timestep)
The input samples.
**check_input** : bool, optional
Bypass array validation. Only set to False if you are sure your data
is valid.
:Returns:
sparse matrix of shape (n_samples, n_nodes)
An indicator array where each nonzero value indicates that the sample
traverses the corresponding node.
.. py:method:: fit(x, y, sample_weight=None, check_input=True)
Fit the estimator.
:Parameters:
**x** : array-like of shape (n_samples, n_timesteps)
The training time series.
**y** : array-like of shape (n_samples,)
Target values as floating point values.
**sample_weight** : array-like of shape (n_samples,), optional
If `None`, then samples are equally weighted. Splits that would create child
nodes with net zero or negative weight are ignored while searching for a
split in each node. Splits are also ignored if they would result in any
single class carrying a negative weight in either child node.
**check_input** : bool, optional
Allows bypassing several input checks.
:Returns:
self
This object.
.. py:method:: get_metadata_routing()
Get metadata routing of this object.
Please check :ref:`User Guide <metadata_routing>` on how the routing
mechanism works.
:Returns:
**routing** : MetadataRequest
A :class:`~sklearn.utils.metadata_routing.MetadataRequest` encapsulating
routing information.
.. py:method:: get_params(deep=True)
Get parameters for this estimator.
:Parameters:
**deep** : bool, default=True
If True, will return the parameters for this estimator and
contained subobjects that are estimators.
:Returns:
**params** : dict
Parameter names mapped to their values.
.. py:method:: predict(x, check_input=True)
Predict the value of each sample in x.
:Parameters:
**x** : array-like of shape (n_samples, n_timesteps)
The input time series.
**check_input** : bool, optional
Allows bypassing several input checks. Don't use this parameter unless you
know what you are doing.
:Returns:
ndarray of shape (n_samples,)
The predicted values.
.. py:method:: score(X, y, sample_weight=None)
Return the coefficient of determination of the prediction.
The coefficient of determination :math:`R^2` is defined as
:math:`(1 - \frac{u}{v})`, where :math:`u` is the residual
sum of squares ``((y_true - y_pred) ** 2).sum()`` and :math:`v`
is the total sum of squares ``((y_true - y_true.mean()) ** 2).sum()``.
The best possible score is 1.0 and it can be negative (because the
model can be arbitrarily worse). A constant model that always predicts
the expected value of `y`, disregarding the input features, would get
a :math:`R^2` score of 0.0.
:Parameters:
**X** : array-like of shape (n_samples, n_features)
Test samples. For some estimators this may be a precomputed
kernel matrix or a list of generic objects instead with shape
``(n_samples, n_samples_fitted)``, where ``n_samples_fitted``
is the number of samples used in the fitting for the estimator.
**y** : array-like of shape (n_samples,) or (n_samples, n_outputs)
True values for `X`.
**sample_weight** : array-like of shape (n_samples,), default=None
Sample weights.
:Returns:
**score** : float
:math:`R^2` of ``self.predict(X)`` w.r.t. `y`.
.. rubric:: Notes
The :math:`R^2` score used when calling ``score`` on a regressor uses
``multioutput='uniform_average'`` from version 0.23 to keep consistent
with default value of :func:`~sklearn.metrics.r2_score`.
This influences the ``score`` method of all the multioutput
regressors (except for
:class:`~sklearn.multioutput.MultiOutputRegressor`).
.. py:method:: set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects
(such as :class:`~sklearn.pipeline.Pipeline`). The latter have
parameters of the form ``<component>__<parameter>`` so that it's
possible to update each component of a nested object.
:Parameters:
**\*\*params** : dict
Estimator parameters.
:Returns:
**self** : estimator instance
Estimator instance.
.. py:class:: PivotTreeClassifier(n_pivot='sqrt', *, metrics='all', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_impurity_decrease=0.0, impurity_equality_tolerance=None, criterion='entropy', class_weight=None, random_state=None)
A tree classifier that uses pivot time series.
:Parameters:
**n_pivot** : str or int, optional
The number of pivot time series to sample at each node.
**metrics** : str, optional
The metrics to sample from. Currently, we only support "all".
**max_depth** : int, optional
The maximum depth of the tree. If `None` the tree is expanded until all
leaves are pure or until all leaves contain less than `min_samples_split`
samples.
**min_samples_split** : int, optional
The minimum number of samples to split an internal node.
**min_samples_leaf** : int, optional
The minimum number of samples in a leaf.
**min_impurity_decrease** : float, optional
A split will be introduced only if the impurity decrease is larger than or
equal to this value.
**impurity_equality_tolerance** : float, optional
Tolerance for considering two impurities as equal. If the impurity decrease
is the same, we consider the split that maximizes the gap between the sum
of distances.
- If None, we never consider the separation gap.
.. versionadded:: 1.3
**criterion** : {"entropy", "gini"}, optional
The criterion used to evaluate the utility of a split.
**class_weight** : dict or "balanced", optional
Weights associated with the labels.
- if dict, weights of the form `{label: weight}`.
- if "balanced", each class weight is inversely proportional to the class
frequency.
- if None, each class has equal weight.
**random_state** : int or RandomState
- If `int`, `random_state` is the seed used by the random number generator
- If `RandomState` instance, `random_state` is the random number generator
- If `None`, the random number generator is the `RandomState` instance used
by `np.random`.
:Attributes:
**tree_** : Tree
The internal tree representation.
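.. rubric:: Examples

A minimal usage sketch:

>>> from wildboar.datasets import load_gun_point
>>> from wildboar.tree import PivotTreeClassifier
>>> X, y = load_gun_point()
>>> clf = PivotTreeClassifier(random_state=1)
>>> clf.fit(X, y).score(X, y)  # doctest: +SKIP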
.. py:method:: apply(x, check_input=True)
Return the index of the leaf that each sample is predicted by.
:Parameters:
**x** : array-like of shape (n_samples, n_timestep) or (n_samples, n_dims, n_timestep)
The input samples.
**check_input** : bool, optional
Bypass array validation. Only set to False if you are sure your data
is valid.
:Returns:
ndarray of shape (n_samples, )
For every sample, return the index of the leaf that the sample
ends up in. The index is in the range [0; node_count].
.. rubric:: Examples
Get the leaf probability distribution of a prediction:
>>> from wildboar.datasets import load_gun_point
>>> from wildboar.tree import ShapeletTreeClassifier
>>> X, y = load_gun_point()
>>> tree = ShapeletTreeClassifier()
>>> tree.fit(X, y)
>>> leaves = tree.apply(X)
>>> tree.tree_.value.take(leaves, axis=0)
array([[0., 1.],
[0., 1.],
[1., 0.]])
This is equivalent to using `tree.predict_proba`.
.. py:method:: decision_path(x, check_input=True)
Compute the decision path of the tree.
:Parameters:
**x** : array-like of shape (n_samples, n_timestep) or (n_samples, n_dims, n_timestep)
The input samples.
**check_input** : bool, optional
Bypass array validation. Only set to False if you are sure your data
is valid.
:Returns:
sparse matrix of shape (n_samples, n_nodes)
An indicator array where each nonzero value indicates that the sample
traverses the corresponding node.
.. py:method:: fit(x, y, sample_weight=None, check_input=True)
Fit a classification tree.
:Parameters:
**x** : array-like of shape (n_samples, n_timesteps)
The training time series.
**y** : array-like of shape (n_samples,)
The target values.
**sample_weight** : array-like of shape (n_samples,), optional
If `None`, then samples are equally weighted. Splits that would create child
nodes with net zero or negative weight are ignored while searching for a
split in each node. Splits are also ignored if they would result in any
single class carrying a negative weight in either child node.
**check_input** : bool, optional
Allows bypassing several input checks.
:Returns:
self
This instance.
.. py:method:: get_metadata_routing()
Get metadata routing of this object.
Please check :ref:`User Guide <metadata_routing>` on how the routing
mechanism works.
:Returns:
**routing** : MetadataRequest
A :class:`~sklearn.utils.metadata_routing.MetadataRequest` encapsulating
routing information.
.. py:method:: get_params(deep=True)
Get parameters for this estimator.
:Parameters:
**deep** : bool, default=True
If True, will return the parameters for this estimator and
contained subobjects that are estimators.
:Returns:
**params** : dict
Parameter names mapped to their values.
.. py:method:: predict(x, check_input=True)
Predict the class for each sample in x.
:Parameters:
**x** : array-like of shape (n_samples, n_timesteps)
The input time series.
**check_input** : bool, optional
Allows bypassing several input checks. Don't use this parameter unless you
know what you are doing.
:Returns:
ndarray of shape (n_samples,)
The predicted classes.
.. py:method:: predict_proba(x, check_input=True)
Predict class probabilities of the input samples X.
The predicted class probability is the fraction of samples of the same
class in a leaf.
:Parameters:
**x** : array-like of shape (n_samples, n_timesteps)
The input time series.
**check_input** : bool, optional
Allows bypassing several input checks. Don't use this parameter unless you
know what you are doing.
:Returns:
ndarray of shape (n_samples, n_classes)
The class probabilities of the input samples. The order of the classes
corresponds to that in the attribute `classes_`.
.. py:method:: score(X, y, sample_weight=None)
Return the mean accuracy on the given test data and labels.
In multi-label classification, this is the subset accuracy
which is a harsh metric since you require for each sample that
each label set be correctly predicted.
:Parameters:
**X** : array-like of shape (n_samples, n_features)
Test samples.
**y** : array-like of shape (n_samples,) or (n_samples, n_outputs)
True labels for `X`.
**sample_weight** : array-like of shape (n_samples,), default=None
Sample weights.
:Returns:
**score** : float
Mean accuracy of ``self.predict(X)`` w.r.t. `y`.
.. py:method:: set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects
(such as :class:`~sklearn.pipeline.Pipeline`). The latter have
parameters of the form ``<component>__<parameter>`` so that it's
possible to update each component of a nested object.
:Parameters:
**\*\*params** : dict
Estimator parameters.
:Returns:
**self** : estimator instance
Estimator instance.
.. py:class:: ProximityTreeClassifier(n_pivot=1, *, criterion='entropy', pivot_sample='label', metric_sample='weighted', metric='auto', metric_params=None, metric_factories=None, max_depth=None, min_samples_split=2, min_samples_leaf=1, min_impurity_decrease=0.0, class_weight=None, random_state=None)
A classifier that uses a k-branching tree based on pivot time series.
:Parameters:
**n_pivot** : int, optional
The number of pivots to sample at each node.
**criterion** : {"entropy", "gini"}, optional
The impurity criterion.
**pivot_sample** : {"label", "uniform"}, optional
The pivot sampling method.
**metric_sample** : {"uniform", "weighted"}, optional
The metric sampling method.
**metric** : {"auto"}, str or list, optional
The distance metrics. By default, we use the parameterization suggested by
Lucas et al. (2019).
- If "auto", use the default metric specification suggested by
Lucas et al. (2019).
- If str, use a single metric or default metric specification.
- If list, custom metric specification can be given as a list of
tuples, where the first element of the tuple is a metric name and the
second element a dictionary with a parameter grid specification. A
parameter grid specification is a `dict` with two mandatory and one
optional key-value pairs defining the lower and upper bound on the
values as well as the number of values in the grid. For example, to
specify a grid over the argument 'r' with 10 values in the range 0
to 1, we would give the following specification:
`dict(min_r=0, max_r=1, num_r=10)`.
Read more about the metrics and their parameters in the
:ref:`User guide `.
**metric_params** : dict, optional
Parameters for the distance measure. Ignored unless metric is a string.
Read more about the parameters in the :ref:`User guide
`.
**metric_factories** : dict, optional
A metric specification.
.. deprecated:: 1.2
Use the combination of metric and metric params.
**max_depth** : int, optional
The maximum tree depth.
**min_samples_split** : int, optional
The minimum number of samples to consider a split.
**min_samples_leaf** : int, optional
The minimum number of samples in a leaf.
**min_impurity_decrease** : float, optional
The minimum impurity decrease to build a sub-tree.
**class_weight** : dict or "balanced", optional
Weights associated with the labels.
- if dict, weights on the form {label: weight}.
- if "balanced" each class weight inversely proportional to the class
frequency.
- if None, each class has equal weight.
**random_state** : int or RandomState
- If `int`, `random_state` is the seed used by the random number generator
- If `RandomState` instance, `random_state` is the random number generator
- If `None`, the random number generator is the `RandomState` instance used
by `np.random`.
.. rubric:: References
Lucas, Benjamin, Ahmed Shifaz, Charlotte Pelletier, Lachlan O'Neill, Nayyar Zaidi, Bart Goethals, François Petitjean, and Geoffrey I. Webb. (2019).
Proximity forest: an effective and scalable distance-based classifier for time
series. Data Mining and Knowledge Discovery.
.. rubric:: Examples
Fit a single proximity tree, with dynamic time warping and move-split-merge metrics.
>>> from wildboar.datasets import load_dataset
>>> from wildboar.tree import ProximityTreeClassifier
>>> x, y = load_dataset("GunPoint")
>>> f = ProximityTreeClassifier(
... n_pivot=10,
... metric=[
... ("dtw", {"min_r": 0.1, "max_r": 0.25}),
... ("msm", {"min_c": 0.1, "max_c": 100, "num_c": 20})
... ],
... criterion="gini"
... )
>>> f.fit(x, y)
.. py:method:: apply(x, check_input=True)
Return the index of the leaf that each sample is predicted by.
:Parameters:
**x** : array-like of shape (n_samples, n_timestep) or (n_samples, n_dims, n_timestep)
The input samples.
**check_input** : bool, optional
Bypass array validation. Only set to False if you are sure your data
is valid.
:Returns:
ndarray of shape (n_samples, )
For every sample, return the index of the leaf that the sample
ends up in. The index is in the range [0; node_count].
.. rubric:: Examples
Get the leaf probability distribution of a prediction:
>>> from wildboar.datasets import load_gun_point
>>> from wildboar.tree import ShapeletTreeClassifier
>>> X, y = load_gun_point()
>>> tree = ShapeletTreeClassifier()
>>> tree.fit(X, y)
>>> leaves = tree.apply(X)
>>> tree.tree_.value.take(leaves, axis=0)
array([[0., 1.],
[0., 1.],
[1., 0.]])
This is equivalent to using `tree.predict_proba`.
.. py:method:: decision_path(x, check_input=True)
Compute the decision path of the tree.
:Parameters:
**x** : array-like of shape (n_samples, n_timestep) or (n_samples, n_dims, n_timestep)
The input samples.
**check_input** : bool, optional
Bypass array validation. Only set to False if you are sure your data
is valid.
:Returns:
sparse matrix of shape (n_samples, n_nodes)
An indicator array where each nonzero value indicates that the sample
traverses the corresponding node.
.. py:method:: fit(x, y, sample_weight=None, check_input=True)
Fit a classification tree.
:Parameters:
**x** : array-like of shape (n_samples, n_timesteps)
The training time series.
**y** : array-like of shape (n_samples,)
The target values.
**sample_weight** : array-like of shape (n_samples,), optional
If `None`, then samples are equally weighted. Splits that would create child
nodes with net zero or negative weight are ignored while searching for a
split in each node. Splits are also ignored if they would result in any
single class carrying a negative weight in either child node.
**check_input** : bool, optional
Allows bypassing several input checks.
:Returns:
self
This instance.
.. py:method:: get_metadata_routing()
Get metadata routing of this object.
Please check :ref:`User Guide <metadata_routing>` on how the routing
mechanism works.
:Returns:
**routing** : MetadataRequest
A :class:`~sklearn.utils.metadata_routing.MetadataRequest` encapsulating
routing information.
.. py:method:: get_params(deep=True)
Get parameters for this estimator.
:Parameters:
**deep** : bool, default=True
If True, will return the parameters for this estimator and
contained subobjects that are estimators.
:Returns:
**params** : dict
Parameter names mapped to their values.
.. py:method:: predict(x, check_input=True)
Predict the class for each sample in x.
:Parameters:
**x** : array-like of shape (n_samples, n_timesteps)
The input time series.
**check_input** : bool, optional
Allows bypassing several input checks. Don't use this parameter unless you
know what you are doing.
:Returns:
ndarray of shape (n_samples,)
The predicted classes.
.. py:method:: predict_proba(x, check_input=True)
Predict class probabilities of the input samples X.
The predicted class probability is the fraction of samples of the same
class in a leaf.
:Parameters:
**x** : array-like of shape (n_samples, n_timesteps)
The input time series.
**check_input** : bool, optional
Allows bypassing several input checks. Don't use this parameter unless you
know what you are doing.
:Returns:
ndarray of shape (n_samples, n_classes)
The class probabilities of the input samples. The order of the classes
corresponds to that in the attribute `classes_`.
.. py:method:: score(X, y, sample_weight=None)
Return the mean accuracy on the given test data and labels.
In multi-label classification, this is the subset accuracy
which is a harsh metric since you require for each sample that
each label set be correctly predicted.
:Parameters:
**X** : array-like of shape (n_samples, n_features)
Test samples.
**y** : array-like of shape (n_samples,) or (n_samples, n_outputs)
True labels for `X`.
**sample_weight** : array-like of shape (n_samples,), default=None
Sample weights.
:Returns:
**score** : float
Mean accuracy of ``self.predict(X)`` w.r.t. `y`.
.. py:method:: set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects
(such as :class:`~sklearn.pipeline.Pipeline`). The latter have
parameters of the form ``<component>__<parameter>`` so that it's
possible to update each component of a nested object.
:Parameters:
**\*\*params** : dict
Estimator parameters.
:Returns:
**self** : estimator instance
Estimator instance.
.. py:class:: RocketTreeClassifier(n_kernels=10, *, max_depth=None, min_samples_split=2, min_samples_leaf=1, min_impurity_decrease=0.0, criterion='entropy', sampling='normal', sampling_params=None, kernel_size=None, min_size=None, max_size=None, bias_prob=1.0, normalize_prob=1.0, padding_prob=0.5, class_weight=None, random_state=None)
A tree classifier that uses random convolutions as features.
:Attributes:
**tree_** : Tree
The internal tree representation.
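.. rubric:: Examples

A minimal usage sketch; `n_kernels` follows the default shown in the signature:

>>> from wildboar.datasets import load_gun_point
>>> from wildboar.tree import RocketTreeClassifier
>>> X, y = load_gun_point()
>>> clf = RocketTreeClassifier(n_kernels=10, random_state=1)
>>> clf.fit(X, y).predict_proba(X[:2])  # doctest: +SKIP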
.. py:method:: apply(x, check_input=True)
Return the index of the leaf that each sample is predicted by.
:Parameters:
**x** : array-like of shape (n_samples, n_timestep) or (n_samples, n_dims, n_timestep)
The input samples.
**check_input** : bool, optional
Bypass array validation. Only set to False if you are sure your data
is valid.
:Returns:
ndarray of shape (n_samples, )
For every sample, return the index of the leaf that the sample
ends up in. The index is in the range [0; node_count].
.. rubric:: Examples
Get the leaf probability distribution of a prediction:
>>> from wildboar.datasets import load_gun_point
>>> from wildboar.tree import ShapeletTreeClassifier
>>> X, y = load_gun_point()
>>> tree = ShapeletTreeClassifier()
>>> tree.fit(X, y)
>>> leaves = tree.apply(X)
>>> tree.tree_.value.take(leaves, axis=0)
array([[0., 1.],
[0., 1.],
[1., 0.]])
This is equivalent to using `tree.predict_proba`.
.. py:method:: decision_path(x, check_input=True)
Compute the decision path of the tree.
:Parameters:
**x** : array-like of shape (n_samples, n_timestep) or (n_samples, n_dims, n_timestep)
The input samples.
**check_input** : bool, optional
Bypass array validation. Only set to False if you are sure your data
is valid.
:Returns:
sparse matrix of shape (n_samples, n_nodes)
An indicator array where each nonzero value indicates that the sample
traverses the corresponding node.
.. py:method:: fit(x, y, sample_weight=None, check_input=True)
Fit a classification tree.
:Parameters:
**x** : array-like of shape (n_samples, n_timesteps)
The training time series.
**y** : array-like of shape (n_samples,)
The target values.
**sample_weight** : array-like of shape (n_samples,), optional
If `None`, then samples are equally weighted. Splits that would create child
nodes with net zero or negative weight are ignored while searching for a
split in each node. Splits are also ignored if they would result in any
single class carrying a negative weight in either child node.
**check_input** : bool, optional
Allows bypassing several input checks.
:Returns:
self
This instance.
.. py:method:: get_metadata_routing()
Get metadata routing of this object.
Please check :ref:`User Guide <metadata_routing>` on how the routing
mechanism works.
:Returns:
**routing** : MetadataRequest
A :class:`~sklearn.utils.metadata_routing.MetadataRequest` encapsulating
routing information.
.. py:method:: get_params(deep=True)
Get parameters for this estimator.
:Parameters:
**deep** : bool, default=True
If True, will return the parameters for this estimator and
contained subobjects that are estimators.
:Returns:
**params** : dict
Parameter names mapped to their values.
.. py:method:: predict(x, check_input=True)
Predict the class for each sample in x.
:Parameters:
**x** : array-like of shape (n_samples, n_timesteps)
The input time series.
**check_input** : bool, optional
Allows bypassing several input checks. Don't use this parameter unless you
know what you are doing.
:Returns:
ndarray of shape (n_samples,)
The predicted classes.
.. py:method:: predict_proba(x, check_input=True)
Predict class probabilities of the input samples X.
The predicted class probability is the fraction of samples of the same
class in a leaf.
:Parameters:
**x** : array-like of shape (n_samples, n_timesteps)
The input time series.
**check_input** : bool, optional
Allows bypassing several input checks. Don't use this parameter unless you
know what you are doing.
:Returns:
ndarray of shape (n_samples, n_classes)
The class probabilities of the input samples. The order of the classes
corresponds to that in the attribute `classes_`.
.. py:method:: score(X, y, sample_weight=None)
Return the mean accuracy on the given test data and labels.
In multi-label classification, this is the subset accuracy
which is a harsh metric since you require for each sample that
each label set be correctly predicted.
:Parameters:
**X** : array-like of shape (n_samples, n_features)
Test samples.
**y** : array-like of shape (n_samples,) or (n_samples, n_outputs)
True labels for `X`.
**sample_weight** : array-like of shape (n_samples,), default=None
Sample weights.
:Returns:
**score** : float
Mean accuracy of ``self.predict(X)`` w.r.t. `y`.
.. py:method:: set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects
(such as :class:`~sklearn.pipeline.Pipeline`). The latter have
parameters of the form ``<component>__<parameter>`` so that it's
possible to update each component of a nested object.
:Parameters:
**\*\*params** : dict
Estimator parameters.
:Returns:
**self** : estimator instance
Estimator instance.
.. py:class:: RocketTreeRegressor(n_kernels=10, *, max_depth=None, min_samples_split=2, min_samples_leaf=1, min_impurity_decrease=0.0, criterion='squared_error', sampling='normal', sampling_params=None, kernel_size=None, bias_prob=1.0, normalize_prob=1.0, padding_prob=0.5, random_state=None)
A tree regressor that uses random convolutions as features.
:Attributes:
**tree_** : Tree
The internal tree representation.
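.. rubric:: Examples

A minimal sketch; GunPoint is a classification dataset, so the labels are cast to float purely for illustration:

>>> from wildboar.datasets import load_gun_point
>>> from wildboar.tree import RocketTreeRegressor
>>> X, y = load_gun_point()
>>> reg = RocketTreeRegressor(n_kernels=10, random_state=1)
>>> reg.fit(X, y.astype(float)).predict(X[:2])  # doctest: +SKIP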
.. py:method:: apply(x, check_input=True)
Return the index of the leaf that each sample is predicted by.
:Parameters:
**x** : array-like of shape (n_samples, n_timestep) or (n_samples, n_dims, n_timestep)
The input samples.
**check_input** : bool, optional
Bypass array validation. Only set to False if you are sure your data
is valid.
:Returns:
ndarray of shape (n_samples, )
For every sample, return the index of the leaf that the sample
ends up in. The index is in the range [0; node_count].
.. rubric:: Examples
Get the leaf probability distribution of a prediction:
>>> from wildboar.datasets import load_gun_point
>>> from wildboar.tree import ShapeletTreeClassifier
>>> X, y = load_gun_point()
>>> tree = ShapeletTreeClassifier()
>>> tree.fit(X, y)
>>> leaves = tree.apply(X)
>>> tree.tree_.value.take(leaves, axis=0)
array([[0., 1.],
[0., 1.],
[1., 0.]])
This is equivalent to using `tree.predict_proba`.
.. py:method:: decision_path(x, check_input=True)
Compute the decision path of the tree.
:Parameters:
**x** : array-like of shape (n_samples, n_timestep) or (n_samples, n_dims, n_timestep)
The input samples.
**check_input** : bool, optional
Bypass array validation. Only set to False if you are sure your data
is valid.
:Returns:
sparse matrix of shape (n_samples, n_nodes)
An indicator array where each nonzero value indicates that the sample
traverses the corresponding node.
.. py:method:: fit(x, y, sample_weight=None, check_input=True)
Fit the estimator.
:Parameters:
**x** : array-like of shape (n_samples, n_timesteps)
The training time series.
**y** : array-like of shape (n_samples,)
Target values as floating point values.
**sample_weight** : array-like of shape (n_samples,), optional
If `None`, then samples are equally weighted. Splits that would create child
nodes with net zero or negative weight are ignored while searching for a
split in each node. Splits are also ignored if they would result in any
single class carrying a negative weight in either child node.
**check_input** : bool, optional
Allows bypassing several input checks.
:Returns:
self
This object.
.. py:method:: get_metadata_routing()
Get metadata routing of this object.
Please check :ref:`User Guide <metadata_routing>` on how the routing
mechanism works.
:Returns:
**routing** : MetadataRequest
A :class:`~sklearn.utils.metadata_routing.MetadataRequest` encapsulating
routing information.
.. py:method:: get_params(deep=True)
Get parameters for this estimator.
:Parameters:
**deep** : bool, default=True
If True, will return the parameters for this estimator and
contained subobjects that are estimators.
:Returns:
**params** : dict
Parameter names mapped to their values.
.. py:method:: predict(x, check_input=True)
Predict the value of each sample in x.
:Parameters:
**x** : array-like of shape (n_samples, n_timesteps)
The input time series.
**check_input** : bool, optional
Allows bypassing several input checks. Don't use this parameter unless you
know what you are doing.
:Returns:
ndarray of shape (n_samples,)
The predicted values.
..
!! processed by numpydoc !!
.. py:method:: score(X, y, sample_weight=None)
Return the coefficient of determination of the prediction.
The coefficient of determination :math:`R^2` is defined as
:math:`(1 - \frac{u}{v})`, where :math:`u` is the residual
sum of squares ``((y_true - y_pred) ** 2).sum()`` and :math:`v`
is the total sum of squares ``((y_true - y_true.mean()) ** 2).sum()``.
The best possible score is 1.0 and it can be negative (because the
model can be arbitrarily worse). A constant model that always predicts
the expected value of `y`, disregarding the input features, would get
a :math:`R^2` score of 0.0.
:Parameters:
**X** : array-like of shape (n_samples, n_features)
Test samples. For some estimators this may be a precomputed
kernel matrix or a list of generic objects instead with shape
``(n_samples, n_samples_fitted)``, where ``n_samples_fitted``
is the number of samples used in the fitting for the estimator.
**y** : array-like of shape (n_samples,) or (n_samples, n_outputs)
True values for `X`.
**sample_weight** : array-like of shape (n_samples,), default=None
Sample weights.
:Returns:
**score** : float
:math:`R^2` of ``self.predict(X)`` w.r.t. `y`.
.. rubric:: Notes
The :math:`R^2` score used when calling ``score`` on a regressor uses
``multioutput='uniform_average'`` from version 0.23 to keep consistent
with default value of :func:`~sklearn.metrics.r2_score`.
This influences the ``score`` method of all the multioutput
regressors (except for
:class:`~sklearn.multioutput.MultiOutputRegressor`).
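.. rubric:: Examples
As a sketch, the returned score matches calling :func:`~sklearn.metrics.r2_score` directly (assuming a fitted regressor `reg` and data `X`, `y`):
>>> from sklearn.metrics import r2_score
>>> reg.score(X, y) == r2_score(y, reg.predict(X))
True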
..
!! processed by numpydoc !!
.. py:method:: set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects
(such as :class:`~sklearn.pipeline.Pipeline`). The latter have
parameters of the form ``<component>__<parameter>`` so that it's
possible to update each component of a nested object.
:Parameters:
**\*\*params** : dict
Estimator parameters.
:Returns:
**self** : estimator instance
Estimator instance.
..
!! processed by numpydoc !!
.. py:class:: ShapeletTreeClassifier(*, n_shapelets='log2', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_impurity_decrease=0.0, impurity_equality_tolerance=None, strategy='warn', shapelet_size=0.1, sample_size=1.0, min_shapelet_size=0.0, max_shapelet_size=1.0, coverage_probability=None, variability=1, alpha=None, metric='euclidean', metric_params=None, criterion='entropy', class_weight=None, random_state=None)
A shapelet tree classifier.
:Parameters:
**n_shapelets** : int or {"log2", "sqrt", "auto"}, optional
The number of shapelets in the resulting transform.
- if, "auto" the number of shapelets depend on the value of `strategy`.
For "best" the number is 1; and for "random" it is 1000.
- if, "log2", the number of shaplets is the log2 of the total possible
number of shapelets.
- if, "sqrt", the number of shaplets is the square root of the total
possible number of shapelets.
**max_depth** : int, optional
The maximum depth of the tree. If `None` the tree is expanded until all
leaves are pure or until all leaves contain less than `min_samples_split`
samples.
**min_samples_split** : int, optional
The minimum number of samples to split an internal node.
**min_samples_leaf** : int, optional
The minimum number of samples in a leaf.
**min_impurity_decrease** : float, optional
A split will be introduced only if the impurity decrease is larger than or
equal to this value.
**impurity_equality_tolerance** : float, optional
Tolerance for considering two impurities as equal. If the impurity decrease
is the same, we consider the split that maximizes the gap between the sum
of distances.
- If None, we never consider the separation gap.
.. versionadded:: 1.3
**strategy** : {"best", "random"}, optional
The strategy for selecting shapelets.
- If "random", `n_shapelets` shapelets are randomly selected in the
range defined by `min_shapelet_size` and `max_shapelet_size`
- If "best", `n_shapelets` shapelets are selected per input sample
of the size determined by `shapelet_size`.
.. versionadded:: 1.3
Add support for the "best" strategy. The default will change to
"best" in 1.4.
**shapelet_size** : int, float or array-like, optional
The shapelet size if `strategy="best"`.
- If int, the exact shapelet size.
- If float, a fraction of the number of input timesteps.
- If array-like, a list of float or int.
.. versionadded:: 1.3
**sample_size** : float, optional
The size of the sample used to determine the shapelets, if `strategy="best"`.
.. versionadded:: 1.3
**min_shapelet_size** : float, optional
The minimum length of a sampled shapelet expressed as a fraction, computed
as `min(ceil(X.shape[-1] * min_shapelet_size), 2)`.
**max_shapelet_size** : float, optional
The maximum length of a sampled shapelet, expressed as a fraction, computed
as `ceil(X.shape[-1] * max_shapelet_size)`.
**coverage_probability** : float, optional
The probability that a time step is covered by a
shapelet, in the range 0 < coverage_probability <= 1.
- For larger `coverage_probability`, we get larger shapelets.
- For smaller `coverage_probability`, we get shorter shapelets.
**variability** : float, optional
Controls the shape of the Beta distribution used to
sample shapelets. Defaults to 1.
- Higher `variability` creates more uniform intervals.
- Lower `variability` creates more variable interval sizes.
**alpha** : float, optional
Dynamically adjust the number of sampled shapelets at each node according
to the current depth, :math:`w = 1 - e^{-|\alpha| \cdot depth}`.
- if :math:`\alpha < 0`, the number of sampled shapelets decreases from
`n_shapelets` towards 1 with increased depth.
- if :math:`\alpha > 0`, the number of sampled shapelets increases from
`1` towards `n_shapelets` with increased depth.
- if `None`, the number of sampled shapelets is the same independent
of depth.
**metric** : str or list, optional
- If `str`, the distance metric used to identify the best
shapelet.
- If `list`, multiple metrics specified as a list of
tuples, where the first element of the tuple is a metric name and
the second element a dictionary with a parameter grid
specification. A parameter grid specification is a dict with two
mandatory and one optional key-value pair, defining the lower and
upper bounds on the values and the number of values in the grid. For
example, to specify a grid over the argument `r` with 10
values in the range 0 to 1, we would give the following
specification: `dict(min_r=0, max_r=1, num_r=10)`.
Read more about metric specifications in the User guide.
.. versionchanged:: 1.2
Added support for multi-metric shapelet transform
**metric_params** : dict, optional
Parameters for the distance measure. Ignored unless metric is a string.
Read more about the parameters in the User guide.
**criterion** : {"entropy", "gini"}, optional
The criterion used to evaluate the utility of a split.
**class_weight** : dict or "balanced", optional
Weights associated with the labels
- if dict, weights of the form `{label: weight}`
- if "balanced", each class weight is inversely proportional to the class
frequency
- if None, each class has equal weight.
**random_state** : int or RandomState, optional
- If `int`, `random_state` is the seed used by the random number generator;
- If `RandomState` instance, `random_state` is the random number generator;
- If `None`, the random number generator is the `RandomState` instance used
by `np.random`.
:Attributes:
**tree_** : Tree
The tree data structure used internally
**classes_** : ndarray of shape (n_classes,)
The class labels
**n_classes_** : int
The number of class labels
.. seealso::
:obj:`ShapeletTreeRegressor`
A shapelet tree regressor.
:obj:`ExtraShapeletTreeClassifier`
An extra random shapelet tree classifier.
.. rubric:: Notes
When `strategy` is set to `"best"`, the shapelet tree is constructed by
selecting the top `n_shapelets` per sample. The initial construction of the
matrix profile for each sample may be computationally intensive for large
datasets. To balance accuracy and computational efficiency, the
`sample_size` parameter can be adjusted to determine the number of samples
utilized to compute the minimum distance annotation.
The significance of shapelets is determined by the difference between the
ab-join of a label with any other label and the self-join of the label,
selecting the shapelets with the greatest absolute values. This method is
detailed in the work of Zhu et al. (2020).
When `strategy` is set to `"random"`, the shapelet tree is constructed by
randomly sampling `n_shapelets` within the range defined by
`min_shapelet_size` and `max_shapelet_size`. This method is detailed in the
work of Karlsson et al. (2016). Alternatively, shapelets can be sampled with
a specified `coverage_probability` and `variability`. By specifying a coverage
probability, we define the probability of including a point in the extracted
shapelet. If `coverage_probability` is set,
`min_shapelet_size` and `max_shapelet_size` are ignored.
.. rubric:: References
Zhu, Y., et al. 2020.
The Swiss army knife of time series data mining: ten useful things you
can do with the matrix profile and ten lines of code. Data Mining and
Knowledge Discovery, 34, pp.949-979.
Karlsson, I., Papapetrou, P. and Boström, H., 2016.
Generalized random shapelet forests. Data mining and knowledge
discovery, 30, pp.1053-1085.
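.. rubric:: Examples
A minimal usage sketch; the multi-metric grid below follows the specification format described under `metric`:
>>> from wildboar.datasets import load_gun_point
>>> from wildboar.tree import ShapeletTreeClassifier
>>> X, y = load_gun_point()
>>> clf = ShapeletTreeClassifier(
...     strategy="random",
...     metric=[("dtw", dict(min_r=0, max_r=1, num_r=10))],
...     random_state=1,
... )
>>> clf.fit(X, y).n_classes_
2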
..
!! processed by numpydoc !!
.. py:method:: apply(x, check_input=True)
Return the index of the leaf that each sample ends up in.
:Parameters:
**x** : array-like of shape (n_samples, n_timestep) or (n_samples, n_dims, n_timestep)
The input samples.
**check_input** : bool, optional
Bypass array validation. Only set to True if you are sure your data
is valid.
:Returns:
ndarray of shape (n_samples, )
For every sample, return the index of the leaf that the sample
ends up in. The index is in the range `[0, node_count)`.
.. rubric:: Examples
Get the leaf probability distribution of a prediction:
>>> from wildboar.datasets import load_gun_point
>>> from wildboar.tree import ShapeletTreeClassifier
>>> X, y = load_gun_point()
>>> tree = ShapeletTreeClassifier()
>>> tree.fit(X, y)
>>> leaves = tree.apply(X)
>>> tree.tree_.value.take(leaves, axis=0)
array([[0., 1.],
[0., 1.],
[1., 0.]])
This is equivalent to using `tree.predict_proba`.
..
!! processed by numpydoc !!
.. py:method:: decision_path(x, check_input=True)
Compute the decision path of the tree.
:Parameters:
**x** : array-like of shape (n_samples, n_timestep) or (n_samples, n_dims, n_timestep)
The input samples.
**check_input** : bool, optional
Bypass array validation. Only set to True if you are sure your data
is valid.
:Returns:
sparse matrix of shape (n_samples, n_nodes)
An indicator array where each nonzero value indicates that the sample
traverses the corresponding node.
..
!! processed by numpydoc !!
.. py:method:: fit(x, y, sample_weight=None, check_input=True)
Fit a classification tree.
:Parameters:
**x** : array-like of shape (n_samples, n_timesteps)
The training time series.
**y** : array-like of shape (n_samples,)
The target values.
**sample_weight** : array-like of shape (n_samples,), optional
If `None`, then samples are equally weighted. Splits that would create child
nodes with net zero or negative weight are ignored while searching for a
split in each node. Splits are also ignored if they would result in any
single class carrying a negative weight in either child node.
**check_input** : bool, optional
Allows bypassing several input checks.
:Returns:
self
This instance.
..
!! processed by numpydoc !!
.. py:method:: get_metadata_routing()
Get metadata routing of this object.
Please check :ref:`User Guide <metadata_routing>` on how the routing
mechanism works.
:Returns:
**routing** : MetadataRequest
A :class:`~sklearn.utils.metadata_routing.MetadataRequest` encapsulating
routing information.
..
!! processed by numpydoc !!
.. py:method:: get_params(deep=True)
Get parameters for this estimator.
:Parameters:
**deep** : bool, default=True
If True, will return the parameters for this estimator and
contained subobjects that are estimators.
:Returns:
**params** : dict
Parameter names mapped to their values.
..
!! processed by numpydoc !!
.. py:method:: predict(x, check_input=True)
Predict the class labels of the input samples x.
:Parameters:
**x** : array-like of shape (n_samples, n_timesteps)
The input time series.
**check_input** : bool, optional
Allows bypassing several input checks. Don't use this parameter unless you
know what you are doing.
:Returns:
ndarray of shape (n_samples,)
The predicted classes.
..
!! processed by numpydoc !!
.. py:method:: predict_proba(x, check_input=True)
Predict class probabilities of the input samples X.
The predicted class probability is the fraction of samples of the same
class in a leaf.
:Parameters:
**x** : array-like of shape (n_samples, n_timesteps)
The input time series.
**check_input** : bool, optional
Allows bypassing several input checks. Don't use this parameter unless you
know what you are doing.
:Returns:
ndarray of shape (n_samples, n_classes)
The class probabilities of the input samples. The order of the classes
corresponds to that in the attribute `classes_`.
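.. rubric:: Examples
A sketch, assuming the fitted classifier from the class example above; each row sums to one:
>>> import numpy as np
>>> proba = clf.predict_proba(X)
>>> proba.shape == (X.shape[0], clf.n_classes_)
True
>>> bool(np.isclose(proba.sum(axis=1), 1).all())
True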
..
!! processed by numpydoc !!
.. py:method:: score(X, y, sample_weight=None)
Return the mean accuracy on the given test data and labels.
In multi-label classification, this is the subset accuracy
which is a harsh metric since you require for each sample that
each label set be correctly predicted.
:Parameters:
**X** : array-like of shape (n_samples, n_features)
Test samples.
**y** : array-like of shape (n_samples,) or (n_samples, n_outputs)
True labels for `X`.
**sample_weight** : array-like of shape (n_samples,), default=None
Sample weights.
:Returns:
**score** : float
Mean accuracy of ``self.predict(X)`` w.r.t. `y`.
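.. rubric:: Examples
A sketch of the equivalence with manually computed accuracy, assuming the fitted classifier from the class example above:
>>> import numpy as np
>>> bool(np.isclose(clf.score(X, y), np.mean(clf.predict(X) == y)))
True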
..
!! processed by numpydoc !!
.. py:method:: set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects
(such as :class:`~sklearn.pipeline.Pipeline`). The latter have
parameters of the form ``<component>__<parameter>`` so that it's
possible to update each component of a nested object.
:Parameters:
**\*\*params** : dict
Estimator parameters.
:Returns:
**self** : estimator instance
Estimator instance.
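.. rubric:: Examples
A short sketch of the ``<component>__<parameter>`` form, assuming the tree is wrapped in a :class:`~sklearn.pipeline.Pipeline` with a step named ``"tree"``:
>>> from sklearn.pipeline import Pipeline
>>> from wildboar.tree import ShapeletTreeClassifier
>>> pipe = Pipeline([("tree", ShapeletTreeClassifier(strategy="random"))])
>>> pipe = pipe.set_params(tree__max_depth=3)
>>> pipe.get_params()["tree__max_depth"]
3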
..
!! processed by numpydoc !!
.. py:class:: ShapeletTreeRegressor(*, n_shapelets='log2', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_impurity_decrease=0.0, impurity_equality_tolerance=None, strategy='warn', shapelet_size=0.1, sample_size=1.0, min_shapelet_size=0, max_shapelet_size=1, coverage_probability=None, variability=1, alpha=None, metric='euclidean', metric_params=None, criterion='squared_error', random_state=None)
A shapelet tree regressor.
:Parameters:
**n_shapelets** : int or {"log2", "sqrt", "auto"}, optional
The number of shapelets to sample at each node.
**max_depth** : int, optional
The maximum depth of the tree. If `None` the tree is
expanded until all leaves are pure or until all leaves contain less
than `min_samples_split` samples.
**min_samples_split** : int, optional
The minimum number of samples to split an internal node.
**min_samples_leaf** : int, optional
The minimum number of samples in a leaf.
**min_impurity_decrease** : float, optional
A split will be introduced only if the impurity decrease is larger
than or equal to this value.
**impurity_equality_tolerance** : float, optional
Tolerance for considering two impurities as equal. If the impurity decrease
is the same, we consider the split that maximizes the gap between the sum
of distances.
- If None, we never consider the separation gap.
.. versionadded:: 1.3
**strategy** : {"best", "random"}, optional
The strategy for selecting shapelets.
- If "random", `n_shapelets` shapelets are randomly selected in the
range defined by `min_shapelet_size` and `max_shapelet_size`
- If "best", `n_shapelets` shapelets are selected per input sample
of the size determined by `shapelet_size`.
.. versionadded:: 1.3
Add support for the "best" strategy. The default will change to
"best" in 1.4.
**shapelet_size** : int, float or array-like, optional
The shapelet size if `strategy="best"`.
- If int, the exact shapelet size.
- If float, a fraction of the number of input timesteps.
- If array-like, a list of float or int.
.. versionadded:: 1.3
**sample_size** : float, optional
The size of the sample used to determine the shapelets, if `strategy="best"`.
.. versionadded:: 1.3
**min_shapelet_size** : float, optional
The minimum length of a shapelet, expressed as a fraction of
*n_timestep*.
**max_shapelet_size** : float, optional
The maximum length of a shapelet, expressed as a fraction of
*n_timestep*.
**coverage_probability** : float, optional
The probability that a time step is covered by a
shapelet, in the range 0 < coverage_probability <= 1.
- For larger `coverage_probability`, we get larger shapelets.
- For smaller `coverage_probability`, we get shorter shapelets.
**variability** : float, optional
Controls the shape of the Beta distribution used to
sample shapelets. Defaults to 1.
- Higher `variability` creates more uniform intervals.
- Lower `variability` creates more variable interval sizes.
**alpha** : float, optional
Dynamically adjust the number of sampled shapelets at each node according
to the current depth, i.e.:
::
w = 1 - exp(-abs(alpha) * depth)
- if `alpha < 0`, the number of sampled shapelets decrease from
`n_shapelets` towards 1 with increased depth.
- if `alpha > 0`, the number of sampled shapelets increase from `1`
towards `n_shapelets` with increased depth.
- if `None`, the number of sampled shapelets is the same
independent of depth.
**metric** : str or list, optional
- If `str`, the distance metric used to identify the best
shapelet.
- If `list`, multiple metrics specified as a list of
tuples, where the first element of the tuple is a metric name and
the second element a dictionary with a parameter grid
specification. A parameter grid specification is a dict with two
mandatory and one optional key-value pair, defining the lower and
upper bounds on the values and the number of values in the grid. For
example, to specify a grid over the argument `r` with 10
values in the range 0 to 1, we would give the following
specification: `dict(min_r=0, max_r=1, num_r=10)`.
Read more about metric specifications in the User guide.
.. versionchanged:: 1.2
Added support for multi-metric shapelet transform
**metric_params** : dict, optional
Parameters for the distance measure. Ignored unless metric is a string.
Read more about the parameters in the User guide.
**criterion** : {"squared_error"}, optional
The criterion used to evaluate the utility of a split.
.. deprecated:: 1.1
Criterion "mse" was deprecated in v1.1 and removed in version 1.2.
**random_state** : int or RandomState, optional
- If `int`, `random_state` is the seed used by the
random number generator
- If :class:`numpy.random.RandomState` instance, `random_state`
is the random number generator
- If `None`, the random number generator is the
:class:`numpy.random.RandomState` instance used by
:func:`numpy.random`.
:Attributes:
**tree_** : Tree
The internal tree representation
.. rubric:: Notes
When `strategy` is set to `"best"`, the shapelet tree is constructed by
selecting the top `n_shapelets` per sample. The initial construction of the
matrix profile for each sample may be computationally intensive for large
datasets. To balance accuracy and computational efficiency, the
`sample_size` parameter can be adjusted to determine the number of samples
utilized to compute the minimum distance annotation.
The significance of shapelets is determined by the difference between the
ab-join of a label with any other label and the self-join of the label,
selecting the shapelets with the greatest absolute values. This method is
detailed in the work of Zhu et al. (2020).
When `strategy` is set to `"random"`, the shapelet tree is constructed by
randomly sampling `n_shapelets` within the range defined by
`min_shapelet_size` and `max_shapelet_size`. This method is detailed in the
work of Karlsson et al. (2016). Alternatively, shapelets can be sampled with
a specified `coverage_probability` and `variability`. By specifying a coverage
probability, we define the probability of including a point in the extracted
shapelet. If `coverage_probability` is set,
`min_shapelet_size` and `max_shapelet_size` are ignored.
.. rubric:: References
Zhu, Y., et al. 2020.
The Swiss army knife of time series data mining: ten useful things you
can do with the matrix profile and ten lines of code. Data Mining and
Knowledge Discovery, 34, pp.949-979.
Karlsson, I., Papapetrou, P. and Boström, H., 2016.
Generalized random shapelet forests. Data mining and knowledge
discovery, 30, pp.1053-1085.
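.. rubric:: Examples
A minimal usage sketch (not one of the library's documented examples), fitting on synthetic sine waves whose frequency is the regression target:
>>> import numpy as np
>>> from wildboar.tree import ShapeletTreeRegressor
>>> rng = np.random.default_rng(1)
>>> y = rng.uniform(1, 5, size=100)
>>> t = np.linspace(0, 1, 50)
>>> X = np.sin(2 * np.pi * y[:, None] * t) + rng.normal(scale=0.1, size=(100, 50))
>>> reg = ShapeletTreeRegressor(strategy="random", random_state=1).fit(X, y)
>>> reg.predict(X[:2]).shape
(2,)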
..
!! processed by numpydoc !!
.. py:method:: apply(x, check_input=True)
Return the index of the leaf that each sample ends up in.
:Parameters:
**x** : array-like of shape (n_samples, n_timestep) or (n_samples, n_dims, n_timestep)
The input samples.
**check_input** : bool, optional
Bypass array validation. Only set to True if you are sure your data
is valid.
:Returns:
ndarray of shape (n_samples, )
For every sample, return the index of the leaf that the sample
ends up in. The index is in the range `[0, node_count)`.
.. rubric:: Examples
Get the leaf probability distribution of a prediction:
>>> from wildboar.datasets import load_gun_point
>>> from wildboar.tree import ShapeletTreeClassifier
>>> X, y = load_gun_point()
>>> tree = ShapeletTreeClassifier()
>>> tree.fit(X, y)
>>> leaves = tree.apply(X)
>>> tree.tree_.value.take(leaves, axis=0)
array([[0., 1.],
[0., 1.],
[1., 0.]])
This is equivalent to using `tree.predict_proba`.
..
!! processed by numpydoc !!
.. py:method:: decision_path(x, check_input=True)
Compute the decision path of the tree.
:Parameters:
**x** : array-like of shape (n_samples, n_timestep) or (n_samples, n_dims, n_timestep)
The input samples.
**check_input** : bool, optional
Bypass array validation. Only set to True if you are sure your data
is valid.
:Returns:
sparse matrix of shape (n_samples, n_nodes)
An indicator array where each nonzero value indicates that the sample
traverses the corresponding node.
..
!! processed by numpydoc !!
.. py:method:: fit(x, y, sample_weight=None, check_input=True)
Fit the estimator.
:Parameters:
**x** : array-like of shape (n_samples, n_timesteps)
The training time series.
**y** : array-like of shape (n_samples,)
Target values as floating point numbers.
**sample_weight** : array-like of shape (n_samples,), optional
If `None`, then samples are equally weighted. Splits that would create child
nodes with net zero or negative weight are ignored while searching for a
split in each node. Splits are also ignored if they would result in any
single class carrying a negative weight in either child node.
**check_input** : bool, optional
Allows bypassing several input checks.
:Returns:
self
This object.
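.. rubric:: Examples
A sketch of weighting samples during fitting, reusing the hypothetical `X` and `y` from the class example above; zero-weight samples are effectively ignored when searching for splits:
>>> import numpy as np
>>> w = np.ones(X.shape[0])
>>> w[:10] = 0.0  # ignore the first ten series
>>> reg = ShapeletTreeRegressor(strategy="random", random_state=1).fit(X, y, sample_weight=w)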
..
!! processed by numpydoc !!
.. py:method:: get_metadata_routing()
Get metadata routing of this object.
Please check :ref:`User Guide <metadata_routing>` on how the routing
mechanism works.
:Returns:
**routing** : MetadataRequest
A :class:`~sklearn.utils.metadata_routing.MetadataRequest` encapsulating
routing information.
..
!! processed by numpydoc !!
.. py:method:: get_params(deep=True)
Get parameters for this estimator.
:Parameters:
**deep** : bool, default=True
If True, will return the parameters for this estimator and
contained subobjects that are estimators.
:Returns:
**params** : dict
Parameter names mapped to their values.
..
!! processed by numpydoc !!
.. py:method:: predict(x, check_input=True)
Predict the value of x.
:Parameters:
**x** : array-like of shape (n_samples, n_timesteps)
The input time series.
**check_input** : bool, optional
Allows bypassing several input checks. Don't use this parameter unless you
know what you are doing.
:Returns:
ndarray of shape (n_samples,)
The predicted values.
..
!! processed by numpydoc !!
.. py:method:: score(X, y, sample_weight=None)
Return the coefficient of determination of the prediction.
The coefficient of determination :math:`R^2` is defined as
:math:`(1 - \frac{u}{v})`, where :math:`u` is the residual
sum of squares ``((y_true - y_pred) ** 2).sum()`` and :math:`v`
is the total sum of squares ``((y_true - y_true.mean()) ** 2).sum()``.
The best possible score is 1.0 and it can be negative (because the
model can be arbitrarily worse). A constant model that always predicts
the expected value of `y`, disregarding the input features, would get
a :math:`R^2` score of 0.0.
:Parameters:
**X** : array-like of shape (n_samples, n_features)
Test samples. For some estimators this may be a precomputed
kernel matrix or a list of generic objects instead with shape
``(n_samples, n_samples_fitted)``, where ``n_samples_fitted``
is the number of samples used in the fitting for the estimator.
**y** : array-like of shape (n_samples,) or (n_samples, n_outputs)
True values for `X`.
**sample_weight** : array-like of shape (n_samples,), default=None
Sample weights.
:Returns:
**score** : float
:math:`R^2` of ``self.predict(X)`` w.r.t. `y`.
.. rubric:: Notes
The :math:`R^2` score used when calling ``score`` on a regressor uses
``multioutput='uniform_average'`` from version 0.23 to keep consistent
with default value of :func:`~sklearn.metrics.r2_score`.
This influences the ``score`` method of all the multioutput
regressors (except for
:class:`~sklearn.multioutput.MultiOutputRegressor`).
..
!! processed by numpydoc !!
.. py:method:: set_params(**params)
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects
(such as :class:`~sklearn.pipeline.Pipeline`). The latter have
parameters of the form ``<component>__<parameter>`` so that it's
possible to update each component of a nested object.
:Parameters:
**\*\*params** : dict
Estimator parameters.
:Returns:
**self** : estimator instance
Estimator instance.
..
!! processed by numpydoc !!
.. py:function:: plot_tree(clf, *, ax=None, bbox_args=dict(), arrow_args=dict(arrowstyle='<-'), max_depth=None, class_labels=True, fontsize=None, node_labeler=None)
Plot a tree.
:Parameters:
**clf** : tree-based estimator
A decision tree.
**ax** : axes, optional
The axes on which to plot the tree.
**bbox_args** : dict, optional
Arguments to the node box.
**arrow_args** : dict, optional
Arguments to the arrow.
**max_depth** : int, optional
Only show the branches until `max_depth`.
**class_labels** : bool or array-like, optional
Show the classes.
- if True, show classes from the `classes_` attribute of the decision
tree.
- if False, show leaf probabilities.
- if array-like, show classes from the array.
**fontsize** : int, optional
The font size. If `None`, the font size is determined automatically.
**node_labeler** : callable, optional
A function returning the label for a node, on the form
`f(node) -> str`.
- If ``node.children is None``, the node is a leaf.
- ``node._attr`` contains information about the node:
- ``n_node_samples``: the number of samples reaching the node
- if leaf, ``value`` is an array with the fractions of labels
reaching the leaf (in case of classification); or the mean among
the samples reaching the leaf (if regression). Determine if it is a
classification or regression tree by inspecting the shape of the
value array.
- if branch, ``threshold`` contains the threshold used to split the
node.
- if branch, ``dim`` contains the dimension from which the attribute
was extracted.
- if branch, ``attribute`` contains the attribute used for computing
the feature value. The attribute depends on the estimator.
:Returns:
axes
The axes.
.. rubric:: Examples
>>> from wildboar.datasets import load_two_lead_ecg
>>> from wildboar.tree import ShapeletTreeClassifier, plot_tree
>>> X, y = load_two_lead_ecg()
>>> clf = ShapeletTreeClassifier(strategy="random").fit(X, y)
>>> plot_tree(clf)
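A sketch of a custom `node_labeler`, using only the node attributes described above (a leaf has ``node.children is None``); the exact rendering depends on the estimator:
>>> def labeler(node):
...     if node.children is None:  # leaf: show the per-class fractions
...         return str(node._attr["value"])
...     return f"threshold: {node._attr['threshold']:.2f}"
>>> plot_tree(clf, node_labeler=labeler)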
..
!! processed by numpydoc !!