wildboar.transform#

Transform raw time series to tabular representations.

Package Contents#

Classes#

CastorTransform

Competing Dilated Shapelet Transform.

DerivativeTransform

Mixin class for all transformers in scikit-learn.

DiffTransform

Mixin class for all transformers in scikit-learn.

DilatedShapeletTransform

Dilated shapelet transform.

FeatureTransform

Transform a time series as a number of features.

HydraTransform

A dictionary-based method using convolutional kernels.

IntervalTransform

Embed a time series as a collection of features per interval.

MatrixProfileTransform

Matrix profile transform.

PAA

Piecewise aggregate approximation.

PivotTransform

A transform using pivot time series and sampled distance metrics.

ProximityTransform

Transform time series based on class conditional pivots.

RandomShapeletTransform

Random shapelet transform.

RocketTransform

Transform a time series using random convolution features.

SAX

Symbolic aggregate approximation.

Functions#

convolve(X, kernel[, bias, dilation, stride, padding])

Apply 1D convolution over a time series.

piecewice_aggregate_approximation(x, *[, n_intervals, ...])

Piecewise aggregate approximation.

symbolic_aggregate_approximation(x, *[, n_intervals, ...])

Symbolic aggregate approximation.

class wildboar.transform.CastorTransform(n_groups=64, n_shapelets=8, *, metric='euclidean', metric_params=None, normalize_prob=0.8, shapelet_size=11, lower=0.05, upper=0.1, soft_min=True, soft_max=False, soft_threshold=True, ignore_y=False, random_state=None, n_jobs=None)[source]#

Competing Dilated Shapelet Transform.

Parameters:
n_groupsint, optional

The number of groups of dilated shapelets.

n_shapeletsint, optional

The number of dilated shapelets per group.

metricstr or callable, optional

The distance metric.

See _METRICS.keys() for a list of supported metrics.

metric_paramsdict, optional

Parameters to the metric.

Read more about the parameters in the User guide.

normalize_probfloat, optional

The probability of standardizing a shapelet with zero mean and unit standard deviation.

shapelet_sizeint, optional

The length of the dilated shapelet.

lowerfloat, optional

The lower percentile to draw distance thresholds above.

upperfloat, optional

The upper percentile to draw distance thresholds below.

soft_minbool, optional

If True, use the sum of minimal distances. Otherwise, use the count of minimal distances.

soft_maxbool, optional

If True, use the sum of maximal distances. Otherwise, use the count of maximal distances.

soft_thresholdbool, optional

If True, count the time steps below the threshold for all shapelets. Otherwise, count the time steps below the threshold for the shapelet with the minimal distance.

ignore_ybool, optional

Ignore y and estimate the distance threshold using the same sample from which the shapelet was drawn.

random_stateint or RandomState, optional

Controls the random sampling of kernels.

  • If int, random_state is the seed used by the random number generator.

  • If numpy.random.RandomState instance, random_state is the random number generator.

  • If None, the random number generator is the numpy.random.RandomState instance used by numpy.random.

n_jobsint, optional

The number of parallel jobs.
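
The effect of soft_min can be pictured with a small plain-Python sketch. It assumes that the shapelets in a group compete per time step and that the shapelet with the smallest distance wins that step. The function name `competing_min_features` and the toy distance profiles are hypothetical, and this is not wildboar's implementation.

```python
def competing_min_features(distance_profiles, soft_min=True):
    """Summarize one group of competing shapelets.

    distance_profiles[k][t] is the distance between shapelet k and the
    subsequence starting at time step t (toy values, assumed precomputed).
    """
    n_shapelets = len(distance_profiles)
    n_steps = len(distance_profiles[0])
    features = [0.0] * n_shapelets
    for t in range(n_steps):
        # The shapelet with the smallest distance wins this time step.
        winner = min(range(n_shapelets), key=lambda k: distance_profiles[k][t])
        if soft_min:
            # Soft: accumulate the winning (minimal) distance.
            features[winner] += distance_profiles[winner][t]
        else:
            # Hard: count how often each shapelet wins.
            features[winner] += 1.0
    return features


profiles = [
    [0.2, 0.9, 0.4],  # shapelet 0
    [0.5, 0.1, 0.3],  # shapelet 1
]
print(competing_min_features(profiles, soft_min=True))
print(competing_min_features(profiles, soft_min=False))
```

With these toy profiles, shapelet 0 wins the first time step and shapelet 1 wins the remaining two, so the hard variant yields one count per win while the soft variant sums the winning distances.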

fit(x, y=None)[source]#

Fit the transform.

Parameters:
xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)

The time series dataset.

yNone, optional

For compatibility.

Returns:
BaseAttributeTransform

This object.

fit_transform(x, y=None)[source]#

Fit the embedding and return the transform of x.

Parameters:
xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)

The time series dataset.

yNone, optional

For compatibility.

Returns:
ndarray of shape (n_samples, n_outputs)

The embedding.

get_metadata_routing()[source]#

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:
routingMetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)[source]#

Get parameters for this estimator.

Parameters:
deepbool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
paramsdict

Parameter names mapped to their values.

set_output(*, transform=None)[source]#

Set output container.

See Introducing the set_output API for an example on how to use the API.

Parameters:
transform{“default”, “pandas”, “polars”}, default=None

Configure output of transform and fit_transform.

  • “default”: Default output format of a transformer

  • “pandas”: DataFrame output

  • “polars”: Polars output

  • None: Transform configuration is unchanged

Added in version 1.4: “polars” option was added.

Returns:
selfestimator instance

Estimator instance.

set_params(**params)[source]#

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**paramsdict

Estimator parameters.

Returns:
selfestimator instance

Estimator instance.
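
The `<component>__<parameter>` convention can be illustrated with a toy class. This is a simplified sketch of the routing idea only: `ToyEstimator` is a hypothetical stand-in and omits the validation that scikit-learn performs.

```python
class ToyEstimator:
    """Hypothetical stand-in for an estimator with set_params semantics."""

    def __init__(self, **params):
        self.params = dict(params)

    def set_params(self, **params):
        nested = {}
        for key, value in params.items():
            # "step__param" addresses `param` on the component named `step`.
            name, delim, sub_key = key.partition("__")
            if delim:
                nested.setdefault(name, {})[sub_key] = value
            else:
                self.params[name] = value
        for name, sub_params in nested.items():
            # Delegate to the nested component, which repeats the same logic.
            self.params[name].set_params(**sub_params)
        return self


pipeline = ToyEstimator(
    transform=ToyEstimator(n_groups=64),
    clf=ToyEstimator(alpha=1.0),
)
pipeline.set_params(transform__n_groups=32, clf__alpha=0.1)
print(pipeline.params["transform"].params)
print(pipeline.params["clf"].params)
```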

transform(x)[source]#

Transform the dataset.

Parameters:
xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)

The time series dataset.

Returns:
ndarray of shape (n_samples, n_outputs)

The transformation.

class wildboar.transform.DerivativeTransform[source]#

Mixin class for all transformers in scikit-learn.

This mixin defines the following functionality:

  • a fit_transform method that delegates to fit and transform;

  • a set_output method to output X as a specific container type.

If get_feature_names_out is defined, then BaseEstimator will automatically wrap transform and fit_transform to follow the set_output API. See the Developer API for set_output for details.

OneToOneFeatureMixin and ClassNamePrefixFeaturesOutMixin are helpful mixins for defining get_feature_names_out.

Examples

>>> import numpy as np
>>> from sklearn.base import BaseEstimator, TransformerMixin
>>> class MyTransformer(TransformerMixin, BaseEstimator):
...     def __init__(self, *, param=1):
...         self.param = param
...     def fit(self, X, y=None):
...         return self
...     def transform(self, X):
...         return np.full(shape=len(X), fill_value=self.param)
>>> transformer = MyTransformer()
>>> X = [[1, 2], [2, 3], [3, 4]]
>>> transformer.fit_transform(X)
array([1, 1, 1])

fit_transform(X, y=None, **fit_params)[source]#

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters:
Xarray-like of shape (n_samples, n_features)

Input samples.

yarray-like of shape (n_samples,) or (n_samples, n_outputs), default=None

Target values (None for unsupervised transformations).

**fit_paramsdict

Additional fit parameters.

Returns:
X_newndarray array of shape (n_samples, n_features_new)

Transformed array.

get_metadata_routing()[source]#

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:
routingMetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)[source]#

Get parameters for this estimator.

Parameters:
deepbool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
paramsdict

Parameter names mapped to their values.

set_output(*, transform=None)[source]#

Set output container.

See Introducing the set_output API for an example on how to use the API.

Parameters:
transform{“default”, “pandas”, “polars”}, default=None

Configure output of transform and fit_transform.

  • “default”: Default output format of a transformer

  • “pandas”: DataFrame output

  • “polars”: Polars output

  • None: Transform configuration is unchanged

Added in version 1.4: “polars” option was added.

Returns:
selfestimator instance

Estimator instance.

set_params(**params)[source]#

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**paramsdict

Estimator parameters.

Returns:
selfestimator instance

Estimator instance.

class wildboar.transform.DiffTransform(order=1)[source]#

Mixin class for all transformers in scikit-learn.

This mixin defines the following functionality:

  • a fit_transform method that delegates to fit and transform;

  • a set_output method to output X as a specific container type.

If get_feature_names_out is defined, then BaseEstimator will automatically wrap transform and fit_transform to follow the set_output API. See the Developer API for set_output for details.

OneToOneFeatureMixin and ClassNamePrefixFeaturesOutMixin are helpful mixins for defining get_feature_names_out.

Examples

>>> import numpy as np
>>> from sklearn.base import BaseEstimator, TransformerMixin
>>> class MyTransformer(TransformerMixin, BaseEstimator):
...     def __init__(self, *, param=1):
...         self.param = param
...     def fit(self, X, y=None):
...         return self
...     def transform(self, X):
...         return np.full(shape=len(X), fill_value=self.param)
>>> transformer = MyTransformer()
>>> X = [[1, 2], [2, 3], [3, 4]]
>>> transformer.fit_transform(X)
array([1, 1, 1])

fit_transform(X, y=None, **fit_params)[source]#

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters:
Xarray-like of shape (n_samples, n_features)

Input samples.

yarray-like of shape (n_samples,) or (n_samples, n_outputs), default=None

Target values (None for unsupervised transformations).

**fit_paramsdict

Additional fit parameters.

Returns:
X_newndarray array of shape (n_samples, n_features_new)

Transformed array.

get_metadata_routing()[source]#

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:
routingMetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)[source]#

Get parameters for this estimator.

Parameters:
deepbool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
paramsdict

Parameter names mapped to their values.

set_output(*, transform=None)[source]#

Set output container.

See Introducing the set_output API for an example on how to use the API.

Parameters:
transform{“default”, “pandas”, “polars”}, default=None

Configure output of transform and fit_transform.

  • “default”: Default output format of a transformer

  • “pandas”: DataFrame output

  • “polars”: Polars output

  • None: Transform configuration is unchanged

Added in version 1.4: “polars” option was added.

Returns:
selfestimator instance

Estimator instance.

set_params(**params)[source]#

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**paramsdict

Estimator parameters.

Returns:
selfestimator instance

Estimator instance.

class wildboar.transform.DilatedShapeletTransform(n_shapelets=1000, *, metric='euclidean', metric_params=None, normalize_prob=0.5, min_shapelet_size=None, max_shapelet_size=None, shapelet_size=None, lower=0.05, upper=0.1, ignore_y=False, random_state=None, n_jobs=None)[source]#

Dilated shapelet transform.

Transform time series to a representation consisting of three values per shapelet: the minimum dilated distance, the index of the time step that minimizes the distance, and the number of subsequences that fall below a distance threshold.
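
The three values per shapelet can be sketched in plain Python. The example assumes a Euclidean distance without normalization and a fixed threshold. The helper `dilated_shapelet_features` is hypothetical and only illustrates the idea, not wildboar's implementation.

```python
import math


def dilated_shapelet_features(series, shapelet, dilation, threshold):
    """Return (min distance, argmin index, count below threshold)."""
    span = (len(shapelet) - 1) * dilation  # offset from first to last point
    distances = []
    for start in range(len(series) - span):
        # Compare the shapelet with every `dilation`-th point from `start`.
        sq = sum(
            (series[start + j * dilation] - value) ** 2
            for j, value in enumerate(shapelet)
        )
        distances.append(math.sqrt(sq))
    best = min(range(len(distances)), key=distances.__getitem__)
    n_below = sum(d < threshold for d in distances)
    return distances[best], best, n_below


series = [0.0, 1.0, 0.0, 2.0, 0.0, 3.0, 0.0, 1.0]
shapelet = [1.0, 2.0, 3.0]
print(dilated_shapelet_features(series, shapelet, dilation=2, threshold=1.5))
# (0.0, 1, 1): the shapelet matches positions 1, 3 and 5 exactly
```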

Parameters:
n_shapeletsint, optional

The number of dilated shapelets.

metricstr or callable, optional

The distance metric.

See _METRICS.keys() for a list of supported metrics.

metric_paramsdict, optional

Parameters to the metric.

Read more about the parameters in the User guide.

normalize_probfloat, optional

The probability of standardizing a shapelet with zero mean and unit standard deviation.

min_shapelet_sizefloat, optional

The minimum shapelet size. If None, use the discrete sizes in shapelet_size.

max_shapelet_sizefloat, optional

The maximum shapelet size. If None, use the discrete sizes in shapelet_size.

shapelet_sizearray-like, optional

The size of shapelets, by default [7, 9, 11].

lowerfloat, optional

The lower percentile to draw distance thresholds above.

upperfloat, optional

The upper percentile to draw distance thresholds below.

ignore_ybool, optional

Ignore y and estimate the distance threshold using the same sample from which the shapelet was drawn.

random_stateint or RandomState, optional

Controls the random sampling of kernels.

  • If int, random_state is the seed used by the random number generator.

  • If numpy.random.RandomState instance, random_state is the random number generator.

  • If None, the random number generator is the numpy.random.RandomState instance used by numpy.random.

n_jobsint, optional

The number of parallel jobs.

References

Antoine Guillaume, Christel Vrain, Elloumi Wael

Random Dilated Shapelet Transform: A New Approach for Time Series Shapelets. Pattern Recognition and Artificial Intelligence, 2022

fit(x, y=None)[source]#

Fit the transform.

Parameters:
xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)

The time series dataset.

yNone, optional

For compatibility.

Returns:
BaseAttributeTransform

This object.

fit_transform(x, y=None)[source]#

Fit the embedding and return the transform of x.

Parameters:
xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)

The time series dataset.

yNone, optional

For compatibility.

Returns:
ndarray of shape (n_samples, n_outputs)

The embedding.

get_metadata_routing()[source]#

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:
routingMetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)[source]#

Get parameters for this estimator.

Parameters:
deepbool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
paramsdict

Parameter names mapped to their values.

set_output(*, transform=None)[source]#

Set output container.

See Introducing the set_output API for an example on how to use the API.

Parameters:
transform{“default”, “pandas”, “polars”}, default=None

Configure output of transform and fit_transform.

  • “default”: Default output format of a transformer

  • “pandas”: DataFrame output

  • “polars”: Polars output

  • None: Transform configuration is unchanged

Added in version 1.4: “polars” option was added.

Returns:
selfestimator instance

Estimator instance.

set_params(**params)[source]#

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**paramsdict

Estimator parameters.

Returns:
selfestimator instance

Estimator instance.

transform(x)[source]#

Transform the dataset.

Parameters:
xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)

The time series dataset.

Returns:
ndarray of shape (n_samples, n_outputs)

The transformation.

class wildboar.transform.FeatureTransform(*, summarizer='catch22', n_jobs=None)[source]#

Transform a time series as a number of features.

Parameters:
summarizerstr or list, optional

The method to summarize each interval.

  • if str, the summarizer is determined by _SUMMARIZERS.keys().

  • if list, the summarizer is a list of functions f(x) -> float, where x is a numpy array.

The default summarizer summarizes each time series using catch22-features.

n_jobsint, optional

The number of cores to use on multi-core systems.
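
A list-valued summarizer can be pictured as applying each function to every series, giving one row of features per series. The sketch below uses plain Python lists rather than numpy arrays, and the helper `summarize` is hypothetical. It shows the idea only and is not how wildboar computes the features.

```python
import statistics


def summarize(dataset, summarizers):
    """Apply every summarizer to every series, one output row per series."""
    return [[float(f(series)) for f in summarizers] for series in dataset]


dataset = [[1.0, 2.0, 3.0], [4.0, 4.0, 4.0]]
features = summarize(dataset, [statistics.fmean, min, max])
print(features)  # [[2.0, 1.0, 3.0], [4.0, 4.0, 4.0]]
```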

Examples

>>> from wildboar.datasets import load_gun_point
>>> from wildboar.transform import FeatureTransform
>>> X, y = load_gun_point()
>>> X_t = FeatureTransform().fit_transform(X)
>>> X_t[0]
array([-5.19633603e-01, -6.51047206e-01,  1.90000000e+01,  4.80000000e+01,
        7.48441896e-01, -2.73293560e-05,  2.21476510e-01,  4.70000000e+01,
        4.00000000e-02,  0.00000000e+00,  2.70502518e+00,  2.60000000e+01,
        6.42857143e-01,  1.00000000e-01, -3.26666667e-01,  9.89974643e-01,
        2.90000000e+01,  1.31570726e+00,  1.50000000e-01,  8.50000000e-01,
        4.90873852e-02,  1.47311800e-01])

fit(x, y=None)[source]#

Fit the transform.

Parameters:
xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)

The time series dataset.

yNone, optional

For compatibility.

Returns:
BaseAttributeTransform

This object.

fit_transform(x, y=None)[source]#

Fit the embedding and return the transform of x.

Parameters:
xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)

The time series dataset.

yNone, optional

For compatibility.

Returns:
ndarray of shape (n_samples, n_outputs)

The embedding.

get_metadata_routing()[source]#

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:
routingMetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)[source]#

Get parameters for this estimator.

Parameters:
deepbool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
paramsdict

Parameter names mapped to their values.

set_output(*, transform=None)[source]#

Set output container.

See Introducing the set_output API for an example on how to use the API.

Parameters:
transform{“default”, “pandas”, “polars”}, default=None

Configure output of transform and fit_transform.

  • “default”: Default output format of a transformer

  • “pandas”: DataFrame output

  • “polars”: Polars output

  • None: Transform configuration is unchanged

Added in version 1.4: “polars” option was added.

Returns:
selfestimator instance

Estimator instance.

set_params(**params)[source]#

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**paramsdict

Estimator parameters.

Returns:
selfestimator instance

Estimator instance.

transform(x)[source]#

Transform the dataset.

Parameters:
xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)

The time series dataset.

Returns:
ndarray of shape (n_samples, n_outputs)

The transformation.

class wildboar.transform.HydraTransform(*, n_groups=64, n_kernels=8, kernel_size=9, sampling='normal', sampling_params=None, n_jobs=None, random_state=None)[source]#

A dictionary-based method using convolutional kernels.

Parameters:
n_groupsint, optional

The number of groups of kernels.

n_kernelsint, optional

The number of kernels per group.

kernel_sizeint, optional

The size of the kernel.

sampling{“normal”}, optional

The strategy for sampling kernels. By default kernel weights are sampled from a normal distribution with zero mean and unit standard deviation.

sampling_paramsdict, optional

Parameters to the sampling approach. The “normal” sampler accepts two parameters: mean and scale.

n_jobsint, optional

The number of jobs to run in parallel. A value of None means using a single core and a value of -1 means using all cores. Positive integers mean the exact number of cores.

random_stateint or RandomState, optional

Controls the random sampling of kernels.

  • If int, random_state is the seed used by the random number generator.

  • If numpy.random.RandomState instance, random_state is the random number generator.

  • If None, the random number generator is the numpy.random.RandomState instance used by numpy.random.
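
As a rough illustration of the “normal” sampling strategy, the sketch below draws kernel weights from a normal distribution and arranges them as n_groups groups of n_kernels kernels. The helper `sample_kernels` and its `seed` argument are hypothetical, and any further processing that Hydra applies to the kernels is omitted.

```python
import random


def sample_kernels(n_groups, n_kernels, kernel_size, mean=0.0, scale=1.0, seed=None):
    """Draw kernel weights, organized as groups x kernels x weights."""
    rng = random.Random(seed)
    return [
        [
            [rng.gauss(mean, scale) for _ in range(kernel_size)]
            for _ in range(n_kernels)
        ]
        for _ in range(n_groups)
    ]


kernels = sample_kernels(n_groups=2, n_kernels=4, kernel_size=9, seed=1)
print(len(kernels), len(kernels[0]), len(kernels[0][0]))  # 2 4 9
```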

See also

HydraClassifier

A classifier using hydra transform.

Notes

This implementation does not include the first-order discrete differences described by Dempster et al. (2023). If they are desired, combine scikit-learn's pipeline utilities with DiffTransform:

>>> from sklearn.pipeline import make_pipeline, make_union
>>> from wildboar.transform import DiffTransform, HydraTransform
>>> dempster_hydra = make_union(
...     HydraTransform(n_groups=32),
...     make_pipeline(
...         DiffTransform(),
...         HydraTransform(n_groups=32)
...     )
... )
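
For reference, a first-order discrete difference replaces each value with the change from its predecessor. The plain-Python helper `diff` below is a hypothetical sketch of that operation. Whether DiffTransform pads or otherwise adjusts the output length is not covered here.

```python
def diff(series, order=1):
    """Apply first-order differencing `order` times."""
    for _ in range(order):
        series = [b - a for a, b in zip(series, series[1:])]
    return series


print(diff([1.0, 4.0, 9.0, 16.0]))           # [3.0, 5.0, 7.0]
print(diff([1.0, 4.0, 9.0, 16.0], order=2))  # [2.0, 2.0]
```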

References

Dempster, A., Schmidt, D. F., & Webb, G. I. (2023).

Hydra: competing convolutional kernels for fast and accurate time series classification. Data Mining and Knowledge Discovery

Examples

>>> from wildboar.datasets import load_gun_point
>>> from wildboar.transform import HydraTransform
>>> X, y = load_gun_point()
>>> t = HydraTransform(n_groups=8, n_kernels=4, random_state=1)
>>> t.fit_transform(X)

Attributes:
embedding_Embedding

The underlying embedding

fit(x, y=None)[source]#

Fit the transform.

Parameters:
xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)

The time series dataset.

yNone, optional

For compatibility.

Returns:
BaseAttributeTransform

This object.

fit_transform(x, y=None)[source]#

Fit the embedding and return the transform of x.

Parameters:
xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)

The time series dataset.

yNone, optional

For compatibility.

Returns:
ndarray of shape (n_samples, n_outputs)

The embedding.

get_metadata_routing()[source]#

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:
routingMetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)[source]#

Get parameters for this estimator.

Parameters:
deepbool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
paramsdict

Parameter names mapped to their values.

set_output(*, transform=None)[source]#

Set output container.

See Introducing the set_output API for an example on how to use the API.

Parameters:
transform{“default”, “pandas”, “polars”}, default=None

Configure output of transform and fit_transform.

  • “default”: Default output format of a transformer

  • “pandas”: DataFrame output

  • “polars”: Polars output

  • None: Transform configuration is unchanged

Added in version 1.4: “polars” option was added.

Returns:
selfestimator instance

Estimator instance.

set_params(**params)[source]#

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**paramsdict

Estimator parameters.

Returns:
selfestimator instance

Estimator instance.

transform(x)[source]#

Transform the dataset.

Parameters:
xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)

The time series dataset.

Returns:
ndarray of shape (n_samples, n_outputs)

The transformation.

class wildboar.transform.IntervalTransform(n_intervals='sqrt', *, intervals='fixed', sample_size=0.5, min_size=0.0, max_size=1.0, summarizer='mean_var_slope', n_jobs=None, random_state=None)[source]#

Embed a time series as a collection of features per interval.

Parameters:
n_intervalsstr, int or float, optional

The number of intervals to use for the transform.

  • if “log2”, the number of intervals is log2(n_timestep).

  • if “sqrt”, the number of intervals is sqrt(n_timestep).

  • if int, the number of intervals is n_intervals.

  • if float, the number of intervals is n_intervals * n_timestep, with 0 < n_intervals < 1.

Deprecated since version 1.2: The option “log” has been renamed to “log2”.
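
The options above can be read as a mapping from n_intervals to a concrete count. The helper `resolve_n_intervals` below is a hypothetical sketch. In particular, truncating to an integer and keeping at least one interval are assumptions, not documented wildboar behavior.

```python
import math


def resolve_n_intervals(n_intervals, n_timestep):
    """Map the n_intervals option to a concrete interval count."""
    if n_intervals == "log2":
        count = math.log2(n_timestep)
    elif n_intervals == "sqrt":
        count = math.sqrt(n_timestep)
    elif isinstance(n_intervals, int):
        count = n_intervals
    elif isinstance(n_intervals, float) and 0.0 < n_intervals < 1.0:
        count = n_intervals * n_timestep
    else:
        raise ValueError(f"unsupported n_intervals: {n_intervals!r}")
    # Assumption: truncate to an integer and keep at least one interval.
    return max(1, int(count))


for option in ("log2", "sqrt", 10, 0.1):
    print(option, resolve_n_intervals(option, n_timestep=150))
```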

intervalsstr, optional

The method for selecting intervals.

  • if “fixed”, n_intervals non-overlapping intervals.

  • if “sample”, n_intervals * sample_size non-overlapping intervals.

  • if “random”, n_intervals possibly overlapping intervals with sizes randomly sampled in [min_size * n_timestep, max_size * n_timestep].

sample_sizefloat, optional

The sample size of fixed intervals if intervals=“sample”.

min_sizefloat, optional

The minimum interval size if intervals=“random”.

max_sizefloat, optional

The maximum interval size if intervals=“random”.

summarizerstr or list, optional

The method to summarize each interval.

  • if str, the summarizer is determined by _SUMMARIZERS.keys().

  • if list, the summarizer is a list of functions f(x) -> float, where x is a numpy array.

The default summarizer summarizes each interval as its mean, standard deviation and slope.

n_jobsint, optional

The number of cores to use on multi-core systems.

random_stateint or RandomState, optional

  • If int, random_state is the seed used by the random number generator.

  • If RandomState instance, random_state is the random number generator.

  • If None, the random number generator is the RandomState instance used by np.random.

Notes

Parallelization depends on releasing the global interpreter lock (GIL). As such, using custom functions as summarizers reduces performance. Wildboar implements summarizers for the mean (“mean”), variance (“variance”) and slope (“slope”), as well as their combination (“mean_var_slope”) and the full suite of catch22 features (“catch22”). In the future, we will allow downstream projects to implement their own summarizers in Cython, which will allow for releasing the GIL.

References

Lubba, Carl H., Sarab S. Sethi, Philip Knaute, Simon R. Schultz, Ben D. Fulcher, and Nick S. Jones.

catch22: Canonical time-series characteristics. Data Mining and Knowledge Discovery 33, no. 6 (2019): 1821-1852.

Examples

>>> import numpy as np
>>> from wildboar.datasets import load_dataset
>>> from wildboar.transform import IntervalTransform
>>> x, y = load_dataset("GunPoint")
>>> t = IntervalTransform(n_intervals=10, summarizer="mean")
>>> t.fit_transform(x)

Each interval (15 timepoints) is transformed to its mean.

>>> t = IntervalTransform(n_intervals="sqrt", summarizer=[np.mean, np.std])
>>> t.fit_transform(x)

Each interval (150 // 12 timepoints) is transformed to two features: the mean and the standard deviation.
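
The arithmetic behind fixed intervals can be checked without wildboar. The sketch below splits one series of 150 time steps into equally sized, non-overlapping intervals and applies each summarizer per interval. The helper `interval_features`, the handling of the remainder and the ordering of the output features are assumptions made for illustration.

```python
import statistics


def interval_features(series, n_intervals, summarizers):
    """Summarize equally sized, non-overlapping intervals of one series."""
    size = len(series) // n_intervals  # trailing remainder is dropped here
    features = []
    for i in range(n_intervals):
        interval = series[i * size:(i + 1) * size]
        features.extend(float(f(interval)) for f in summarizers)
    return features


series = [float(t) for t in range(150)]
means = interval_features(series, 10, [statistics.fmean])
both = interval_features(series, 12, [statistics.fmean, statistics.pstdev])
print(len(means), means[0], means[-1])  # 10 7.0 142.0
print(len(both))                        # 24
```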

fit(x, y=None)[source]#

Fit the transform.

Parameters:
xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)

The time series dataset.

yNone, optional

For compatibility.

Returns:
BaseAttributeTransform

This object.

fit_transform(x, y=None)[source]#

Fit the embedding and return the transform of x.

Parameters:
xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)

The time series dataset.

yNone, optional

For compatibility.

Returns:
ndarray of shape (n_samples, n_outputs)

The embedding.

get_metadata_routing()[source]#

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:
routingMetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)[source]#

Get parameters for this estimator.

Parameters:
deepbool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
paramsdict

Parameter names mapped to their values.

set_output(*, transform=None)[source]#

Set output container.

See Introducing the set_output API for an example on how to use the API.

Parameters:
transform{“default”, “pandas”, “polars”}, default=None

Configure output of transform and fit_transform.

  • “default”: Default output format of a transformer

  • “pandas”: DataFrame output

  • “polars”: Polars output

  • None: Transform configuration is unchanged

Added in version 1.4: “polars” option was added.

Returns:
selfestimator instance

Estimator instance.

set_params(**params)[source]#

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**paramsdict

Estimator parameters.

Returns:
selfestimator instance

Estimator instance.

transform(x)[source]#

Transform the dataset.

Parameters:
xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)

The time series dataset.

Returns:
ndarray of shape (n_samples, n_outputs)

The transformation.

class wildboar.transform.MatrixProfileTransform(window=0.1, exclude=None, n_jobs=None)[source]#

Matrix profile transform.

Transform each time series in a dataset to its MatrixProfile similarity self-join.

Parameters:
windowint or float, optional

The subsequence size, by default 0.1.

  • if float, a fraction of n_timestep.

  • if int, the exact subsequence size.

excludeint or float, optional

The size of the exclusion zone. The default exclusion zone is 0.2.

  • if float, expressed as a fraction of the window size.

  • if int, exact size (0 < exclude).

n_jobsint, optional

The number of jobs to use when computing the profile.
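
To make the roles of window and exclude concrete, the sketch below computes a naive matrix profile self-join for a single series. It assumes the common definition based on z-normalized Euclidean distance and takes both sizes as integers. The helpers `znormalize` and `matrix_profile` are hypothetical and far slower than an optimized implementation.

```python
import math


def znormalize(values):
    """Scale a subsequence to zero mean and unit standard deviation."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    if std == 0.0:
        return [0.0] * len(values)  # constant subsequence
    return [(v - mean) / std for v in values]


def matrix_profile(series, window, exclude):
    """Naive self-join: distance to the nearest non-trivial match."""
    n_sub = len(series) - window + 1
    subs = [znormalize(series[i:i + window]) for i in range(n_sub)]
    profile = []
    for i in range(n_sub):
        best = math.inf
        for j in range(n_sub):
            if abs(i - j) <= exclude:
                continue  # skip trivial matches inside the exclusion zone
            dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(subs[i], subs[j])))
            best = min(best, dist)
        profile.append(best)
    return profile


series = [0.0, 1.0, 0.0, 5.0, 0.0, 1.0, 0.0, 2.0]
mp = matrix_profile(series, window=3, exclude=1)
print(len(mp))
print(mp)
```

The subsequence starting at index 0 reappears at index 4, so both positions have a profile value of zero, while subsequences without a close match elsewhere in the series receive larger values.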

Examples

>>> from wildboar.datasets import load_two_lead_ecg
>>> from wildboar.transform import MatrixProfileTransform
>>> x, y = load_two_lead_ecg()
>>> t = MatrixProfileTransform()
>>> t.fit_transform(x)

fit(x, y=None)[source]#

Fit the matrix profile.

Sets the expected input dimensions.

Parameters:
xarray-like of shape (n_samples, n_timesteps) or (n_samples, n_dims, n_timesteps)

The samples.

yignored

The optional labels.

Returns:
self

A fitted instance.

fit_transform(X, y=None, **fit_params)[source]#

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters:
Xarray-like of shape (n_samples, n_features)

Input samples.

yarray-like of shape (n_samples,) or (n_samples, n_outputs), default=None

Target values (None for unsupervised transformations).

**fit_paramsdict

Additional fit parameters.

Returns:
X_newndarray array of shape (n_samples, n_features_new)

Transformed array.

get_metadata_routing()[source]#

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:
routingMetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)[source]#

Get parameters for this estimator.

Parameters:
deepbool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
paramsdict

Parameter names mapped to their values.

set_output(*, transform=None)[source]#

Set output container.

See Introducing the set_output API for an example on how to use the API.

Parameters:
transform{“default”, “pandas”, “polars”}, default=None

Configure output of transform and fit_transform.

  • “default”: Default output format of a transformer

  • “pandas”: DataFrame output

  • “polars”: Polars output

  • None: Transform configuration is unchanged

Added in version 1.4: “polars” option was added.

Returns:
selfestimator instance

Estimator instance.

set_params(**params)[source]#

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**paramsdict

Estimator parameters.

Returns:
selfestimator instance

Estimator instance.

transform(x)[source]#

Transform the samples to their MatrixProfile self-join.

Parameters:
xarray-like of shape (n_samples, n_timesteps) or (n_samples, n_dims, n_timesteps)

The samples.

Returns:
ndarray of shape (n_samples, n_timesteps) or (n_samples, n_dims, n_timesteps)

The matrix profile of each sample.
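As a rough illustration of what a matrix profile self-join measures, the sketch below computes, for every subsequence of one series, the z-normalized Euclidean distance to its nearest non-overlapping neighbour. It is a naive pure-Python version and not the library's implementation; the names `znorm` and `naive_matrix_profile` and the default exclusion zone of half a window are assumptions made for the example.

```python
import math


def znorm(seq):
    """Z-normalize a subsequence; constant subsequences map to all zeros."""
    mean = sum(seq) / len(seq)
    std = math.sqrt(sum((v - mean) ** 2 for v in seq) / len(seq))
    if std == 0.0:
        return [0.0] * len(seq)
    return [(v - mean) / std for v in seq]


def naive_matrix_profile(series, window, exclude=None):
    """Distance from each subsequence to its nearest non-trivial neighbour."""
    if exclude is None:
        exclude = window // 2  # assumed exclusion zone, not taken from the library
    subs = [znorm(series[i:i + window]) for i in range(len(series) - window + 1)]
    profile = []
    for i, a in enumerate(subs):
        best = math.inf
        for j, b in enumerate(subs):
            if abs(i - j) <= exclude:
                continue  # skip trivial matches that overlap position i
            best = min(best, math.dist(a, b))
        profile.append(best)
    return profile


series = [0, 1, 2, 1, 0, 1, 2, 1, 0, 5, 0, 1]
mp = naive_matrix_profile(series, window=4)
print(mp)
```

Low values mark subsequences that repeat elsewhere in the series; high values mark unusual subsequences.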

class wildboar.transform.PAA(n_intervals='sqrt', window=None)[source]#

Piecewise aggregate approximation.

Parameters:
n_intervals{“sqrt”, “log2”}, int or float, optional

The number of intervals.

windowint, optional

The size of an interval. If window is given, n_intervals is ignored.
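A minimal sketch of the idea behind piecewise aggregate approximation, assuming the series is split into n_intervals contiguous segments of near-equal length and each segment is replaced by its mean. The helper name `paa` and the way segment boundaries are rounded are assumptions for illustration; the library may place boundaries differently.

```python
def paa(series, n_intervals):
    """Mean of each of n_intervals near-equal, contiguous segments."""
    n = len(series)
    out = []
    for k in range(n_intervals):
        start = k * n // n_intervals        # assumed boundary rounding
        end = (k + 1) * n // n_intervals
        segment = series[start:end]
        out.append(sum(segment) / len(segment))
    return out


approx = paa([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0], n_intervals=4)
print(approx)  # [1.5, 3.5, 5.5, 7.5]
```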

fit_transform(X, y=None, **fit_params)[source]#

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters:
Xarray-like of shape (n_samples, n_features)

Input samples.

yarray-like of shape (n_samples,) or (n_samples, n_outputs), default=None

Target values (None for unsupervised transformations).

**fit_paramsdict

Additional fit parameters.

Returns:
X_newndarray array of shape (n_samples, n_features_new)

Transformed array.

get_metadata_routing()[source]#

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:
routingMetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)[source]#

Get parameters for this estimator.

Parameters:
deepbool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
paramsdict

Parameter names mapped to their values.

set_output(*, transform=None)[source]#

Set output container.

See Introducing the set_output API for an example on how to use the API.

Parameters:
transform{“default”, “pandas”, “polars”}, default=None

Configure output of transform and fit_transform.

  • “default”: Default output format of a transformer

  • “pandas”: DataFrame output

  • “polars”: Polars output

  • None: Transform configuration is unchanged

Added in version 1.4: “polars” option was added.

Returns:
selfestimator instance

Estimator instance.

set_params(**params)[source]#

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**paramsdict

Estimator parameters.

Returns:
selfestimator instance

Estimator instance.

class wildboar.transform.PivotTransform(n_pivots=100, *, metric='auto', metric_params=None, metric_sample=None, random_state=None, n_jobs=None)[source]#

A transform using pivot time series and sampled distance metrics.

Parameters:
n_pivotsint, optional

The number of pivot time series.

metric{‘auto’} or list, optional
  • If str, the metric to compute the distance.

  • If list, multiple metrics specified as a list of tuples, where the first element of the tuple is a metric name and the second element a dictionary with a parameter grid specification. A parameter grid specification is a dict with two mandatory and one optional key-value pairs defining the lower and upper bound on the values and number of values in the grid. For example, to specify a grid over the argument ‘r’ with 10 values in the range 0 to 1, we would give the following specification: dict(min_r=0, max_r=1, num_r=10).

Read more about the metrics and their parameters in the User guide.
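To make the parameter grid specification concrete, the sketch below expands a dict such as dict(min_r=0, max_r=1, num_r=10) into explicit values. It assumes the grid is evenly spaced and includes both bounds; the helper `expand_grid` and its `default_num` fallback for a missing num_ key are hypothetical and not part of the library.

```python
def expand_grid(spec, default_num=10):
    """Expand {min_x, max_x[, num_x]} entries into evenly spaced values."""
    grid = {}
    for key, low in spec.items():
        if not key.startswith("min_"):
            continue
        name = key[len("min_"):]
        high = spec["max_" + name]                  # mandatory upper bound
        num = spec.get("num_" + name, default_num)  # optional; default is assumed
        if num == 1:
            grid[name] = [low]
        else:
            step = (high - low) / (num - 1)
            grid[name] = [low + i * step for i in range(num)]
    return grid


grid = expand_grid(dict(min_r=0, max_r=1, num_r=5))
print(grid)  # {'r': [0.0, 0.25, 0.5, 0.75, 1.0]}
```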

metric_paramsdict, optional

Parameters for the distance measure. Ignored unless metric is a string.

Read more about the parameters in the User guide.

metric_sample{“uniform”, “weighted”}, optional

If multiple metrics are specified this parameter controls how they are sampled. “uniform” samples each metric configuration with equal probability and “weighted” samples each metric with equal probability. By default, metric configurations are sampled with equal probability.
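One way to read the difference between the two strategies is sketched below: "uniform" draws from the pooled list of all (metric, parameters) configurations, while "weighted" first draws a metric and then one of its configurations. The function names and the second step of the "weighted" branch are assumptions for illustration, not the library's code.

```python
import random


def sample_metric(configs, strategy, rng):
    """Draw one (metric name, parameters) pair from ``configs``."""
    if strategy == "uniform":
        # Every configuration is equally likely, so metrics with large
        # grids are drawn more often.
        pairs = [(name, p) for name, grid in configs.items() for p in grid]
        return rng.choice(pairs)
    if strategy == "weighted":
        # Every metric is equally likely, regardless of its grid size.
        name = rng.choice(list(configs))
        return name, rng.choice(configs[name])
    raise ValueError(f"unknown strategy: {strategy!r}")


def euclidean_share(strategy, n_draws=10_000, seed=0):
    """Fraction of draws that select the single-configuration metric."""
    configs = {
        "dtw": [{"r": i / 8} for i in range(9)],  # nine configurations
        "euclidean": [{}],                        # one configuration
    }
    rng = random.Random(seed)
    draws = [sample_metric(configs, strategy, rng)[0] for _ in range(n_draws)]
    return draws.count("euclidean") / n_draws


print(euclidean_share("uniform"), euclidean_share("weighted"))
```

With nine DTW configurations and one Euclidean configuration, "uniform" picks Euclidean in about a tenth of the draws and "weighted" in about half.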

random_stateint or np.RandomState, optional

The random state.

n_jobsint, optional

The number of cores to use.
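The general idea of a pivot-based transform can be sketched as follows: choose a set of pivot time series, then describe every sample by its distance to each pivot. The sketch assumes pivots are drawn from the training samples and uses a single Euclidean metric; the helpers `fit_pivots` and `pivot_transform` are hypothetical and do not reproduce the metric sampling of the library.

```python
import math
import random


def fit_pivots(X, n_pivots, rng):
    """Pick n_pivots distinct training series as pivots (assumed strategy)."""
    return rng.sample(X, n_pivots)


def pivot_transform(X, pivots, metric=math.dist):
    """One output column per pivot: the distance from each sample to it."""
    return [[metric(x, p) for p in pivots] for x in X]


rng = random.Random(0)
X = [[0.0, 1.0, 2.0], [2.0, 1.0, 0.0], [1.0, 1.0, 1.0], [0.0, 0.0, 0.0]]
pivots = fit_pivots(X, n_pivots=2, rng=rng)
features = pivot_transform(X, pivots)
print(features)
```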

fit(x, y=None)[source]#

Fit the transform.

Parameters:
xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)

The time series dataset.

yNone, optional

For compatibility.

Returns:
BaseAttributeTransform

This object.

fit_transform(x, y=None)[source]#

Fit the embedding and return the transform of x.

Parameters:
xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)

The time series dataset.

yNone, optional

For compatibility.

Returns:
ndarray of shape (n_samples, n_outputs)

The embedding.

get_metadata_routing()[source]#

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:
routingMetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)[source]#

Get parameters for this estimator.

Parameters:
deepbool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
paramsdict

Parameter names mapped to their values.

set_output(*, transform=None)[source]#

Set output container.

See Introducing the set_output API for an example on how to use the API.

Parameters:
transform{“default”, “pandas”, “polars”}, default=None

Configure output of transform and fit_transform.

  • “default”: Default output format of a transformer

  • “pandas”: DataFrame output

  • “polars”: Polars output

  • None: Transform configuration is unchanged

Added in version 1.4: “polars” option was added.

Returns:
selfestimator instance

Estimator instance.

set_params(**params)[source]#

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**paramsdict

Estimator parameters.

Returns:
selfestimator instance

Estimator instance.

transform(x)[source]#

Transform the dataset.

Parameters:
xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)

The time series dataset.

Returns:
ndarray of shape (n_samples, n_outputs)

The transformation.

class wildboar.transform.ProximityTransform(n_pivots=100, metric='auto', metric_params=None, metric_sample='weighted', random_state=None, n_jobs=None)[source]#

Transform time series based on class conditional pivots.

Parameters:
n_pivotsint, optional

The number of pivot time series per class.

metric{‘auto’} or list, optional
  • If str, the metric to compute the distance.

  • If list, multiple metrics specified as a list of tuples, where the first element of the tuple is a metric name and the second element a dictionary with a parameter grid specification. A parameter grid specification is a dict with two mandatory and one optional key-value pairs defining the lower and upper bound on the values and number of values in the grid. For example, to specify a grid over the argument ‘r’ with 10 values in the range 0 to 1, we would give the following specification: dict(min_r=0, max_r=1, num_r=10).

Read more about the metrics and their parameters in the User guide.

metric_paramsdict, optional

Parameters for the distance measure. Ignored unless metric is a string.

Read more about the parameters in the User guide.

metric_sample{“uniform”, “weighted”}, optional

If multiple metrics are specified this parameter controls how they are sampled. “uniform” samples each metric configuration with equal probability and “weighted” samples each metric with equal probability. By default, metric configurations are sampled with equal probability.

random_stateint or np.RandomState, optional

The random state.

n_jobsint, optional

The number of cores to use.

fit_transform(X, y=None, **fit_params)[source]#

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters:
Xarray-like of shape (n_samples, n_features)

Input samples.

yarray-like of shape (n_samples,) or (n_samples, n_outputs), default=None

Target values (None for unsupervised transformations).

**fit_paramsdict

Additional fit parameters.

Returns:
X_newndarray array of shape (n_samples, n_features_new)

Transformed array.

get_metadata_routing()[source]#

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:
routingMetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)[source]#

Get parameters for this estimator.

Parameters:
deepbool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
paramsdict

Parameter names mapped to their values.

set_output(*, transform=None)[source]#

Set output container.

See Introducing the set_output API for an example on how to use the API.

Parameters:
transform{“default”, “pandas”, “polars”}, default=None

Configure output of transform and fit_transform.

  • “default”: Default output format of a transformer

  • “pandas”: DataFrame output

  • “polars”: Polars output

  • None: Transform configuration is unchanged

Added in version 1.4: “polars” option was added.

Returns:
selfestimator instance

Estimator instance.

set_params(**params)[source]#

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**paramsdict

Estimator parameters.

Returns:
selfestimator instance

Estimator instance.

class wildboar.transform.RandomShapeletTransform(n_shapelets=1000, *, metric='euclidean', metric_params=None, min_shapelet_size=0.0, max_shapelet_size=1.0, n_jobs=None, random_state=None)[source]#

Random shapelet transform.

Transform a time series to the distances to a selection of random shapelets.

Parameters:
n_shapeletsint, optional

The number of shapelets in the resulting transform.

metricstr or list, optional
  • If str, the distance metric used to identify the best shapelet.

  • If list, multiple metrics specified as a list of tuples, where the first element of the tuple is a metric name and the second element a dictionary with a parameter grid specification. A parameter grid specification is a dict with two mandatory and one optional key-value pairs defining the lower and upper bound on the values and number of values in the grid. For example, to specify a grid over the argument ‘r’ with 10 values in the range 0 to 1, we would give the following specification: dict(min_r=0, max_r=1, num_r=10).

Read more about the metrics and their parameters in the User guide.

metric_paramsdict, optional

Parameters for the distance measure. Ignored unless metric is a string.

Read more about the parameters in the User guide.

min_shapelet_sizefloat, optional

Minimum shapelet size.

max_shapelet_sizefloat, optional

Maximum shapelet size.

n_jobsint, optional

The number of jobs to run in parallel. None means 1 and -1 means using all processors.

random_stateint or RandomState, optional
  • If int, random_state is the seed used by the random number generator

  • If RandomState instance, random_state is the random number generator

  • If None, the random number generator is the RandomState instance used by np.random.

References

Wistuba, Martin, Josif Grabocka, and Lars Schmidt-Thieme.

Ultra-fast shapelets for time series classification. arXiv preprint arXiv:1503.05018 (2015).

Examples

Transform each time series to the minimum DTW distance to each shapelet:

>>> from wildboar.datasets import load_gun_point
>>> from wildboar.transform import RandomShapeletTransform
>>> X, y = load_gun_point()
>>> t = RandomShapeletTransform(metric="dtw")
>>> t.fit_transform(X)

Transform each time series to either the minimum DTW distance, with r randomly set between 0 and 1, or the minimum ERP distance, with g randomly set between 0 and 1:

>>> t = RandomShapeletTransform(
...     metric=[
...         ("dtw", dict(min_r=0.0, max_r=1.0)),
...         ("erp", dict(min_g=0.0, max_g=1.0)),
...     ]
... )
>>> t.fit_transform(X)
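
To show what the resulting features represent, here is a pure-Python sketch of the idea: sample random subsequences (shapelets) from the data, then describe each series by its minimum sliding Euclidean distance to every shapelet. The helper names, the lower bound of two time steps on the shapelet length, and the use of Euclidean distance only are assumptions for the example and not the library's implementation.

```python
import math
import random


def sample_shapelets(X, n_shapelets, min_size, max_size, rng):
    """Sample random subsequences; sizes are fractions of the series length."""
    shapelets = []
    for _ in range(n_shapelets):
        series = rng.choice(X)
        n = len(series)
        low = max(2, int(min_size * n))    # assumed floor of two time steps
        high = max(low, int(max_size * n))
        length = rng.randint(low, high)
        start = rng.randint(0, n - length)
        shapelets.append(series[start:start + length])
    return shapelets


def min_distance(series, shapelet):
    """Smallest distance between the shapelet and any window of the series."""
    m = len(shapelet)
    return min(
        math.dist(series[i:i + m], shapelet) for i in range(len(series) - m + 1)
    )


def shapelet_transform(X, shapelets):
    """One output column per shapelet."""
    return [[min_distance(x, s) for s in shapelets] for x in X]


rng = random.Random(1)
data = [[rng.random() for _ in range(8)] for _ in range(4)]
shapelets = sample_shapelets(data, 5, min_size=0.0, max_size=1.0, rng=rng)
shapelet_features = shapelet_transform(data, shapelets)
print(len(shapelet_features), len(shapelet_features[0]))  # 4 5
```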
Attributes:
embedding_Embedding

The underlying embedding object.

fit(x, y=None)[source]#

Fit the transform.

Parameters:
xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)

The time series dataset.

yNone, optional

For compatibility.

Returns:
BaseAttributeTransform

This object.

fit_transform(x, y=None)[source]#

Fit the embedding and return the transform of x.

Parameters:
xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)

The time series dataset.

yNone, optional

For compatibility.

Returns:
ndarray of shape (n_samples, n_outputs)

The embedding.

get_metadata_routing()[source]#

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:
routingMetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)[source]#

Get parameters for this estimator.

Parameters:
deepbool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
paramsdict

Parameter names mapped to their values.

set_output(*, transform=None)[source]#

Set output container.

See Introducing the set_output API for an example on how to use the API.

Parameters:
transform{“default”, “pandas”, “polars”}, default=None

Configure output of transform and fit_transform.

  • “default”: Default output format of a transformer

  • “pandas”: DataFrame output

  • “polars”: Polars output

  • None: Transform configuration is unchanged

Added in version 1.4: “polars” option was added.

Returns:
selfestimator instance

Estimator instance.

set_params(**params)[source]#

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**paramsdict

Estimator parameters.

Returns:
selfestimator instance

Estimator instance.

transform(x)[source]#

Transform the dataset.

Parameters:
xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)

The time series dataset.

Returns:
ndarray of shape (n_samples, n_outputs)

The transformation.

class wildboar.transform.RocketTransform(n_kernels=1000, *, sampling='normal', sampling_params=None, kernel_size=None, min_size=None, max_size=None, bias_prob=1.0, normalize_prob=1.0, padding_prob=0.5, n_jobs=None, random_state=None)[source]#

Transform a time series using random convolution features.

Parameters:
n_kernelsint, optional

The number of kernels to sample.

sampling{“normal”, “uniform”, “shapelet”}, optional

The sampling of convolutional filters.

  • if “normal”, sample filter according to a normal distribution with mean and scale.

  • if “uniform”, sample filter according to a uniform distribution with lower and upper.

  • if “shapelet”, sample filters as subsequences in the training data.

sampling_paramsdict, optional

Parameters for the sampling strategy.

  • if “normal”, {"mean": float, "scale": float}, defaults to {"mean": 0, "scale": 1}.

  • if “uniform”, {"lower": float, "upper": float}, defaults to {"lower": -1, "upper": 1}.

kernel_sizearray-like, optional

The kernel size, by default [7, 11, 13].

min_sizefloat, optional

The minimum timestep size used for generating kernel sizes. If set, kernel_size is ignored.

max_sizefloat, optional

The maximum timestep size used for generating kernel sizes. If set, kernel_size is ignored.

bias_probfloat, optional

The probability of using the bias term.

normalize_probfloat, optional

The probability of performing normalization.

padding_probfloat, optional

The probability of padding with zeros.

n_jobsint, optional

The number of jobs to run in parallel. A value of None means using a single core and a value of -1 means using all cores. Positive integers mean the exact number of cores.

random_stateint or RandomState, optional

Controls the random resampling of the original dataset.

  • If int, random_state is the seed used by the random number generator.

  • If numpy.random.RandomState instance, random_state is the random number generator.

  • If None, the random number generator is the numpy.random.RandomState instance used by numpy.random.

References

Dempster, Angus, François Petitjean, and Geoffrey I. Webb.

ROCKET: exceptionally fast and accurate time series classification using random convolutional kernels. Data Mining and Knowledge Discovery 34.5 (2020): 1454-1495.

Examples

>>> from wildboar.datasets import load_gun_point
>>> from wildboar.transform import RocketTransform
>>> X, y = load_gun_point()
>>> t = RocketTransform(n_kernels=10, random_state=1)
>>> t.fit_transform(X)
array([[0.51333333, 5.11526939, 0.47333333, ..., 2.04712544, 0.24      ,
        0.82912261],
       [0.52666667, 5.26611524, 0.54      , ..., 1.98047216, 0.24      ,
        0.81260641],
       [0.54666667, 4.71210092, 0.35333333, ..., 2.28841158, 0.25333333,
        0.82203705],
       ...,
       [0.54666667, 4.72938203, 0.45333333, ..., 2.53756324, 0.24666667,
        0.8380654 ],
       [0.68666667, 3.80533684, 0.26      , ..., 2.41709413, 0.25333333,
        0.65634235],
       [0.66      , 3.94724793, 0.32666667, ..., 1.85575661, 0.25333333,
        0.67630249]])
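
The following sketch illustrates how random convolution features of this kind can be computed for a single kernel: weights are drawn according to the sampling strategy, the kernel is slid over the series, and the output is summarized by its maximum and its proportion of positive values, the two summaries described in the ROCKET paper. The helper names are hypothetical, and the library's feature order, normalization and padding are not reproduced here.

```python
import random


def sample_kernel(size, sampling, params, rng):
    """Draw kernel weights for the "normal" or "uniform" strategy."""
    if sampling == "normal":
        return [rng.gauss(params["mean"], params["scale"]) for _ in range(size)]
    if sampling == "uniform":
        return [rng.uniform(params["lower"], params["upper"]) for _ in range(size)]
    raise ValueError(f"unsupported sampling: {sampling!r}")


def kernel_features(series, kernel, bias=0.0, dilation=1):
    """Return (max, proportion of positive values) of the dilated convolution."""
    span = (len(kernel) - 1) * dilation + 1
    outputs = [
        bias + sum(w * series[i + j * dilation] for j, w in enumerate(kernel))
        for i in range(len(series) - span + 1)
    ]
    ppv = sum(o > 0 for o in outputs) / len(outputs)
    return max(outputs), ppv


rng = random.Random(1)
random_kernel = sample_kernel(7, "normal", {"mean": 0, "scale": 1}, rng)
print(kernel_features([1, 2, 3, 2, 1], [1, -1]))  # (1.0, 0.5)
```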
Attributes:
embedding_Embedding

The underlying embedding object.

fit(x, y=None)[source]#

Fit the transform.

Parameters:
xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)

The time series dataset.

yNone, optional

For compatibility.

Returns:
BaseAttributeTransform

This object.

fit_transform(x, y=None)[source]#

Fit the embedding and return the transform of x.

Parameters:
xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)

The time series dataset.

yNone, optional

For compatibility.

Returns:
ndarray of shape (n_samples, n_outputs)

The embedding.

get_metadata_routing()[source]#

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:
routingMetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)[source]#

Get parameters for this estimator.

Parameters:
deepbool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
paramsdict

Parameter names mapped to their values.

set_output(*, transform=None)[source]#

Set output container.

See Introducing the set_output API for an example on how to use the API.

Parameters:
transform{“default”, “pandas”, “polars”}, default=None

Configure output of transform and fit_transform.

  • “default”: Default output format of a transformer

  • “pandas”: DataFrame output

  • “polars”: Polars output

  • None: Transform configuration is unchanged

Added in version 1.4: “polars” option was added.

Returns:
selfestimator instance

Estimator instance.

set_params(**params)[source]#

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**paramsdict

Estimator parameters.

Returns:
selfestimator instance

Estimator instance.

transform(x)[source]#

Transform the dataset.

Parameters:
xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)

The time series dataset.

Returns:
ndarray of shape (n_samples, n_outputs)

The transformation.

class wildboar.transform.SAX(*, n_intervals='sqrt', window=None, n_bins=4, binning='normal', estimate=True)[source]#

Symbolic aggregate approximation.

Parameters:
n_intervals{“sqrt”, “log2”}, int or float, optional

The number of intervals to use for the transform.

  • if “log2”, the number of intervals is log2(n_timestep).

  • if “sqrt”, the number of intervals is sqrt(n_timestep).

  • if int, the number of intervals is n_intervals.

  • if float, the number of intervals is n_intervals * n_timestep, with 0 < n_intervals < 1.
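
The rules above can be written out as a small helper. This is a sketch only: the function name `resolve_n_intervals` is hypothetical, and rounding non-integer results up is an assumption, since the rounding used by the library is not stated here.

```python
import math


def resolve_n_intervals(n_intervals, n_timestep):
    """Map an n_intervals specification to a concrete interval count."""
    if n_intervals == "log2":
        return math.ceil(math.log2(n_timestep))   # rounding up is assumed
    if n_intervals == "sqrt":
        return math.ceil(math.sqrt(n_timestep))
    if isinstance(n_intervals, int):
        return n_intervals
    if isinstance(n_intervals, float):
        if not 0.0 < n_intervals < 1.0:
            raise ValueError("a float n_intervals must satisfy 0 < n_intervals < 1")
        return math.ceil(n_intervals * n_timestep)
    raise ValueError(f"unsupported n_intervals: {n_intervals!r}")


print(resolve_n_intervals("sqrt", 100))  # 10
print(resolve_n_intervals(0.25, 100))    # 25
```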

windowint, optional

The window size. If window is set, the value of n_intervals has no effect.

n_binsint, optional

The number of bins.

binningstr, optional

The bin construction. By default the bins are defined according to the normal distribution. Possible values are “normal” for normally distributed bins or “uniform” for uniformly distributed bins.

estimatebool, optional

Estimate the distribution parameters for the binning from data.

If estimate=False, it is assumed that each time series is preprocessed using:

  • datasets.preprocess.normalize when binning=”normal”.

  • datasets.preprocess.minmax_scale when binning=”uniform”.

fit_transform(X, y=None, **fit_params)[source]#

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters:
Xarray-like of shape (n_samples, n_features)

Input samples.

yarray-like of shape (n_samples,) or (n_samples, n_outputs), default=None

Target values (None for unsupervised transformations).

**fit_paramsdict

Additional fit parameters.

Returns:
X_newndarray array of shape (n_samples, n_features_new)

Transformed array.

get_metadata_routing()[source]#

Get metadata routing of this object.

Please check User Guide on how the routing mechanism works.

Returns:
routingMetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)[source]#

Get parameters for this estimator.

Parameters:
deepbool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
paramsdict

Parameter names mapped to their values.

set_output(*, transform=None)[source]#

Set output container.

See Introducing the set_output API for an example on how to use the API.

Parameters:
transform{“default”, “pandas”, “polars”}, default=None

Configure output of transform and fit_transform.

  • “default”: Default output format of a transformer

  • “pandas”: DataFrame output

  • “polars”: Polars output

  • None: Transform configuration is unchanged

Added in version 1.4: “polars” option was added.

Returns:
selfestimator instance

Estimator instance.

set_params(**params)[source]#

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**paramsdict

Estimator parameters.

Returns:
selfestimator instance

Estimator instance.

wildboar.transform.convolve(X, kernel, bias=0.0, *, dilation=1, stride=1, padding=0)[source]#

Apply 1D convolution over a time series.

Parameters:
Xarray-like of shape (n_samples, n_timestep)

The input.

kernelarray-like of shape (kernel_size, )

The kernel.

biasfloat, optional

The bias.

dilationint, optional

The spacing between kernel elements.

strideint, optional

The stride of the convolving kernel.

paddingint, optional

Implicit padding on both sides of the input time series.

Returns:
ndarray of shape (n_samples, output_size)

The result of the convolution, where output_size is given by:

floor(
    ((X.shape[1] + 2 * padding) - ((kernel.shape[0] - 1) * dilation + 1)) / stride
    + 1
).

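
The output size can be checked against a naive convolution. The sketch below assumes a dilated kernel spans (kernel_size - 1) * dilation + 1 time steps, that padding adds zeros on both sides, and that the kernel is applied without flipping (cross-correlation). The helpers `output_size` and `convolve_1d` are written for illustration and are not the library's implementation.

```python
import math


def output_size(n_timestep, kernel_size, dilation=1, stride=1, padding=0):
    """Number of output values of a strided, dilated, padded 1D convolution."""
    span = (kernel_size - 1) * dilation + 1
    return math.floor((n_timestep + 2 * padding - span) / stride + 1)


def convolve_1d(series, kernel, bias=0.0, dilation=1, stride=1, padding=0):
    """Naive 1D convolution of a single series (zero padding is assumed)."""
    padded = [0.0] * padding + list(series) + [0.0] * padding
    span = (len(kernel) - 1) * dilation + 1
    return [
        bias + sum(w * padded[i + j * dilation] for j, w in enumerate(kernel))
        for i in range(0, len(padded) - span + 1, stride)
    ]


out = convolve_1d(range(10), [1.0, 0.0, -1.0], dilation=2, stride=2, padding=1)
print(len(out), output_size(10, 3, dilation=2, stride=2, padding=1))  # 4 4
```
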
wildboar.transform.piecewice_aggregate_approximation(x, *, n_intervals='sqrt', window=None)[source]#

Piecewise aggregate approximation.

Parameters:
xarray-like of shape (n_samples, n_timestep)

The input data.

n_intervals{“sqrt”, “log2”}, int or float, optional

The number of intervals to use for the transform.

  • if “log2”, the number of intervals is log2(n_timestep).

  • if “sqrt”, the number of intervals is sqrt(n_timestep).

  • if int, the number of intervals is n_intervals.

  • if float, the number of intervals is n_intervals * n_timestep, with 0 < n_intervals < 1.

windowint, optional

The window size. If window is set, the value of n_intervals has no effect.

Returns:
ndarray of shape (n_samples, n_intervals)

The piecewise aggregate approximation.

wildboar.transform.symbolic_aggregate_approximation(x, *, n_intervals='sqrt', window=None, n_bins=4, binning='normal')[source]#

Symbolic aggregate approximation.

Parameters:
xarray-like of shape (n_samples, n_timestep)

The input data.

n_intervals{“sqrt”, “log2”}, int or float, optional

The number of intervals to use for the transform.

  • if “log2”, the number of intervals is log2(n_timestep).

  • if “sqrt”, the number of intervals is sqrt(n_timestep).

  • if int, the number of intervals is n_intervals.

  • if float, the number of intervals is n_intervals * n_timestep, with 0 < n_intervals < 1.

windowint, optional

The window size. If window is set, the value of n_intervals has no effect.

n_binsint, optional

The number of bins.

binningstr, optional

The bin construction. By default the bins are defined according to the normal distribution. Possible values are "normal" for normally distributed bins or "uniform" for uniformly distributed bins.

Returns:
ndarray of shape (n_samples, n_intervals)

The symbolic aggregate approximation.