wildboar.transform
Transform raw time series to tabular representations.
Package Contents
Classes
Competing Dilated Shapelet Transform. |
|
Mixin class for all transformers in scikit-learn. |
|
Mixin class for all transformers in scikit-learn. |
|
Dilated shapelet transform. |
|
Transform a time series as a number of features. |
|
A Dictionary based method using convolutional kernels. |
|
Embed a time series as a collection of features per interval. |
|
Matrix profile transform. |
|
Piecewise aggregate approximation. |
|
A transform using pivot time series and sampled distance metrics. |
|
Transform time series based on class conditional pivots. |
|
Random shapelet transform. |
|
Transform a time series using random convolution features. |
|
Symbolic aggregate approximation. |
Functions
|
Apply 1D convolution over a time series. |
|
Piecewise aggregate approximation. |
|
Symbolic aggregate approximation. |
- class wildboar.transform.CastorTransform(n_groups=64, n_shapelets=8, *, metric='euclidean', metric_params=None, normalize_prob=0.8, shapelet_size=11, lower=0.05, upper=0.1, soft_min=True, soft_max=False, soft_threshold=True, ignore_y=False, random_state=None, n_jobs=None)[source]#
Competing Dilated Shapelet Transform.
- Parameters:
- n_groupsint, optional
The number of groups of dilated shapelets.
- n_shapeletsint, optional
The number of dilated shapelets per group.
- metricstr or callable, optional
The distance metric. See _METRICS.keys() for a list of supported metrics.
- metric_paramsdict, optional
Parameters to the metric.
Read more about the parameters in the User guide.
- normalize_probfloat, optional
The probability of standardizing a shapelet with zero mean and unit standard deviation.
- shapelet_sizeint, optional
The length of the dilated shapelet.
- lowerfloat, optional
The lower percentile to draw distance thresholds above.
- upperfloat, optional
The upper percentile to draw distance thresholds below.
- soft_minbool, optional
If True, use the sum of minimal distances. Otherwise, use the count of minimal distances.
- soft_maxbool, optional
If True, use the sum of maximal distances. Otherwise, use the count of maximal distances.
- soft_thresholdbool, optional
If True, count the time steps below the threshold for all shapelets. Otherwise, count the time steps below the threshold for the shapelet with the minimal distance.
- ignore_ybool, optional
Ignore y and use the same sample which a shapelet is sampled from to estimate the distance threshold.
- random_stateint or RandomState, optional
Controls the random sampling of kernels.
If int, random_state is the seed used by the random number generator.
If numpy.random.RandomState instance, random_state is the random number generator.
If None, the random number generator is the numpy.random.RandomState instance used by numpy.random.
- n_jobsint, optional
The number of parallel jobs.
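The soft_min option can be read along the lines of the following NumPy sketch. It is an illustration, not wildboar's implementation: it assumes a precomputed matrix of distances between each shapelet in one group and each position of a single time series, and the helper name competing_min_features is hypothetical.

```python
import numpy as np


def competing_min_features(dist, soft_min=True):
    """Per-shapelet features for one group of competing shapelets.

    dist is a hypothetical, precomputed array of shape
    (n_shapelets, n_positions) for a single time series.
    """
    dist = np.asarray(dist, dtype=float)
    n_shapelets, n_positions = dist.shape
    # The shapelet with the smallest distance "wins" each position.
    winners = dist.argmin(axis=0)
    win_dist = dist[winners, np.arange(n_positions)]
    if soft_min:
        # Sum of the minimal distances credited to each winning shapelet.
        return np.bincount(winners, weights=win_dist, minlength=n_shapelets)
    # Count of positions at which each shapelet has the minimal distance.
    return np.bincount(winners, minlength=n_shapelets).astype(float)


dist = np.array([[0.1, 0.9, 0.4],
                 [0.5, 0.2, 0.3]])
counts = competing_min_features(dist, soft_min=False)
sums = competing_min_features(dist, soft_min=True)
```

Here the first shapelet wins one position and the second wins two, so the count features and the summed-distance features differ only in what is accumulated per winner.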
- fit(x, y=None)[source]#
Fit the transform.
- Parameters:
- xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)
The time series dataset.
- yNone, optional
For compatibility.
- Returns:
- BaseAttributeTransform
This object.
- fit_transform(x, y=None)[source]#
Fit the embedding and return the transform of x.
- Parameters:
- xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)
The time series dataset.
- yNone, optional
For compatibility.
- Returns:
- ndarray of shape (n_samples, n_outputs)
The embedding.
- get_metadata_routing()[source]#
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns:
- routingMetadataRequest
A MetadataRequest encapsulating routing information.
- get_params(deep=True)[source]#
Get parameters for this estimator.
- Parameters:
- deepbool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- paramsdict
Parameter names mapped to their values.
- set_output(*, transform=None)[source]#
Set output container.
See Introducing the set_output API for an example on how to use the API.
- Parameters:
- transform{“default”, “pandas”, “polars”}, default=None
Configure output of transform and fit_transform.
“default”: Default output format of a transformer
“pandas”: DataFrame output
“polars”: Polars output
None: Transform configuration is unchanged
Added in version 1.4: “polars” option was added.
- Returns:
- selfestimator instance
Estimator instance.
- set_params(**params)[source]#
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **paramsdict
Estimator parameters.
- Returns:
- selfestimator instance
Estimator instance.
- class wildboar.transform.DerivativeTransform[source]#
Mixin class for all transformers in scikit-learn.
This mixin defines the following functionality:
a fit_transform method that delegates to fit and transform;
a set_output method to output X as a specific container type.
If get_feature_names_out is defined, then BaseEstimator will automatically wrap transform and fit_transform to follow the set_output API. See the Developer API for set_output for details.
OneToOneFeatureMixin and ClassNamePrefixFeaturesOutMixin are helpful mixins for defining get_feature_names_out.
Examples
>>> import numpy as np
>>> from sklearn.base import BaseEstimator, TransformerMixin
>>> class MyTransformer(TransformerMixin, BaseEstimator):
...     def __init__(self, *, param=1):
...         self.param = param
...     def fit(self, X, y=None):
...         return self
...     def transform(self, X):
...         return np.full(shape=len(X), fill_value=self.param)
>>> transformer = MyTransformer()
>>> X = [[1, 2], [2, 3], [3, 4]]
>>> transformer.fit_transform(X)
array([1, 1, 1])
- fit_transform(X, y=None, **fit_params)[source]#
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
- Parameters:
- Xarray-like of shape (n_samples, n_features)
Input samples.
- yarray-like of shape (n_samples,) or (n_samples, n_outputs), default=None
Target values (None for unsupervised transformations).
- **fit_paramsdict
Additional fit parameters.
- Returns:
- X_newndarray array of shape (n_samples, n_features_new)
Transformed array.
- get_metadata_routing()[source]#
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns:
- routingMetadataRequest
A MetadataRequest encapsulating routing information.
- get_params(deep=True)[source]#
Get parameters for this estimator.
- Parameters:
- deepbool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- paramsdict
Parameter names mapped to their values.
- set_output(*, transform=None)[source]#
Set output container.
See Introducing the set_output API for an example on how to use the API.
- Parameters:
- transform{“default”, “pandas”, “polars”}, default=None
Configure output of transform and fit_transform.
“default”: Default output format of a transformer
“pandas”: DataFrame output
“polars”: Polars output
None: Transform configuration is unchanged
Added in version 1.4: “polars” option was added.
- Returns:
- selfestimator instance
Estimator instance.
- set_params(**params)[source]#
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **paramsdict
Estimator parameters.
- Returns:
- selfestimator instance
Estimator instance.
- class wildboar.transform.DiffTransform(order=1)[source]#
Mixin class for all transformers in scikit-learn.
This mixin defines the following functionality:
a fit_transform method that delegates to fit and transform;
a set_output method to output X as a specific container type.
If get_feature_names_out is defined, then BaseEstimator will automatically wrap transform and fit_transform to follow the set_output API. See the Developer API for set_output for details.
OneToOneFeatureMixin and ClassNamePrefixFeaturesOutMixin are helpful mixins for defining get_feature_names_out.
Examples
>>> import numpy as np
>>> from sklearn.base import BaseEstimator, TransformerMixin
>>> class MyTransformer(TransformerMixin, BaseEstimator):
...     def __init__(self, *, param=1):
...         self.param = param
...     def fit(self, X, y=None):
...         return self
...     def transform(self, X):
...         return np.full(shape=len(X), fill_value=self.param)
>>> transformer = MyTransformer()
>>> X = [[1, 2], [2, 3], [3, 4]]
>>> transformer.fit_transform(X)
array([1, 1, 1])
- fit_transform(X, y=None, **fit_params)[source]#
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
- Parameters:
- Xarray-like of shape (n_samples, n_features)
Input samples.
- yarray-like of shape (n_samples,) or (n_samples, n_outputs), default=None
Target values (None for unsupervised transformations).
- **fit_paramsdict
Additional fit parameters.
- Returns:
- X_newndarray array of shape (n_samples, n_features_new)
Transformed array.
- get_metadata_routing()[source]#
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns:
- routingMetadataRequest
A MetadataRequest encapsulating routing information.
- get_params(deep=True)[source]#
Get parameters for this estimator.
- Parameters:
- deepbool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- paramsdict
Parameter names mapped to their values.
- set_output(*, transform=None)[source]#
Set output container.
See Introducing the set_output API for an example on how to use the API.
- Parameters:
- transform{“default”, “pandas”, “polars”}, default=None
Configure output of transform and fit_transform.
“default”: Default output format of a transformer
“pandas”: DataFrame output
“polars”: Polars output
None: Transform configuration is unchanged
Added in version 1.4: “polars” option was added.
- Returns:
- selfestimator instance
Estimator instance.
- set_params(**params)[source]#
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **paramsdict
Estimator parameters.
- Returns:
- selfestimator instance
Estimator instance.
- class wildboar.transform.DilatedShapeletTransform(n_shapelets=1000, *, metric='euclidean', metric_params=None, normalize_prob=0.5, min_shapelet_size=None, max_shapelet_size=None, shapelet_size=None, lower=0.05, upper=0.1, ignore_y=False, random_state=None, n_jobs=None)[source]#
Dilated shapelet transform.
Transform time series to a representation consisting of three values per shapelet: minimum dilated distance, the index of the timestep that minimizes the distance and number of subsequences that are below a distance threshold.
- Parameters:
- n_shapeletsint, optional
The number of dilated shapelets.
- metricstr or callable, optional
The distance metric. See _METRICS.keys() for a list of supported metrics.
- metric_paramsdict, optional
Parameters to the metric.
Read more about the parameters in the User guide.
- normalize_probfloat, optional
The probability of standardizing a shapelet with zero mean and unit standard deviation.
- min_shapelet_sizefloat, optional
The minimum shapelet size. If None, use the discrete sizes in shapelet_size.
- max_shapelet_sizefloat, optional
The maximum shapelet size. If None, use the discrete sizes in shapelet_size.
- shapelet_sizearray-like, optional
The size of shapelets, by default [7, 9, 11].
- lowerfloat, optional
The lower percentile to draw distance thresholds above.
- upperfloat, optional
The upper percentile to draw distance thresholds below.
- ignore_ybool, optional
Ignore y and use the same sample which a shapelet is sampled from to estimate the distance threshold.
- random_stateint or RandomState, optional
Controls the random sampling of kernels.
If int, random_state is the seed used by the random number generator.
If numpy.random.RandomState instance, random_state is the random number generator.
If None, the random number generator is the numpy.random.RandomState instance used by numpy.random.
- n_jobsint, optional
The number of parallel jobs.
References
- Antoine Guillaume, Christel Vrain, Elloumi Wael
Random Dilated Shapelet Transform: A New Approach for Time Series Shapelets. Pattern Recognition and Artificial Intelligence, 2022
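The three values per shapelet described above can be illustrated with a small NumPy sketch. It assumes a single univariate series, a plain Euclidean distance without normalization, and a fixed threshold; the helper name dilated_shapelet_features is hypothetical and the code is not wildboar's implementation.

```python
import numpy as np


def dilated_shapelet_features(x, shapelet, dilation, threshold):
    """Return (minimum distance, index of the minimum, count below threshold)."""
    x = np.asarray(x, dtype=float)
    shapelet = np.asarray(shapelet, dtype=float)
    # Number of timesteps covered by the shapelet once gaps are inserted.
    span = (len(shapelet) - 1) * dilation + 1
    n_positions = len(x) - span + 1
    # Offsets of the shapelet points relative to the start position.
    offsets = np.arange(len(shapelet)) * dilation
    dists = np.array([
        np.linalg.norm(x[start + offsets] - shapelet)
        for start in range(n_positions)
    ])
    return float(dists.min()), int(dists.argmin()), int((dists < threshold).sum())


x = np.array([0.0, 1.0, 0.0, 2.0, 0.0, 3.0, 0.0])
min_dist, arg_min, n_below = dilated_shapelet_features(
    x, shapelet=[1.0, 2.0], dilation=2, threshold=1.5
)
```

With a dilation of 2, the two-point shapelet is compared against every other timestep, so it matches the values at indices 1 and 3 exactly.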
- fit(x, y=None)[source]#
Fit the transform.
- Parameters:
- xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)
The time series dataset.
- yNone, optional
For compatibility.
- Returns:
- BaseAttributeTransform
This object.
- fit_transform(x, y=None)[source]#
Fit the embedding and return the transform of x.
- Parameters:
- xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)
The time series dataset.
- yNone, optional
For compatibility.
- Returns:
- ndarray of shape (n_samples, n_outputs)
The embedding.
- get_metadata_routing()[source]#
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns:
- routingMetadataRequest
A MetadataRequest encapsulating routing information.
- get_params(deep=True)[source]#
Get parameters for this estimator.
- Parameters:
- deepbool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- paramsdict
Parameter names mapped to their values.
- set_output(*, transform=None)[source]#
Set output container.
See Introducing the set_output API for an example on how to use the API.
- Parameters:
- transform{“default”, “pandas”, “polars”}, default=None
Configure output of transform and fit_transform.
“default”: Default output format of a transformer
“pandas”: DataFrame output
“polars”: Polars output
None: Transform configuration is unchanged
Added in version 1.4: “polars” option was added.
- Returns:
- selfestimator instance
Estimator instance.
- set_params(**params)[source]#
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **paramsdict
Estimator parameters.
- Returns:
- selfestimator instance
Estimator instance.
- class wildboar.transform.FeatureTransform(*, summarizer='catch22', n_jobs=None)[source]#
Transform a time series as a number of features.
- Parameters:
- summarizerstr or list, optional
The method to summarize each interval.
if str, the summarizer is determined by _SUMMARIZERS.keys().
if list, the summarizer is a list of functions f(x) -> float, where x is a numpy array.
The default summarizer summarizes each time series using catch22-features.
- n_jobsint, optional
The number of cores to use on multi-core.
Examples
>>> from wildboar.datasets import load_gun_point
>>> from wildboar.transform import FeatureTransform
>>> X, y = load_gun_point()
>>> X_t = FeatureTransform().fit_transform(X)
>>> X_t[0]
array([-5.19633603e-01, -6.51047206e-01,  1.90000000e+01,  4.80000000e+01,
        7.48441896e-01, -2.73293560e-05,  2.21476510e-01,  4.70000000e+01,
        4.00000000e-02,  0.00000000e+00,  2.70502518e+00,  2.60000000e+01,
        6.42857143e-01,  1.00000000e-01, -3.26666667e-01,  9.89974643e-01,
        2.90000000e+01,  1.31570726e+00,  1.50000000e-01,  8.50000000e-01,
        4.90873852e-02,  1.47311800e-01])
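When summarizer is given as a list of functions f(x) -> float, the resulting table can be pictured as one column per function. The following NumPy sketch shows that idea only; the helper name summarize is hypothetical and it does not reproduce wildboar's internals or its parallel execution.

```python
import numpy as np


def summarize(X, summarizer):
    """Apply each function f(x) -> float to every time series in X."""
    X = np.asarray(X, dtype=float)
    # One row per time series, one column per summarizer function.
    return np.array([[f(x) for f in summarizer] for x in X])


X = np.array([[1.0, 2.0, 3.0],
              [4.0, 4.0, 4.0]])
X_t = summarize(X, [np.mean, np.std])
```

The output has shape (n_samples, len(summarizer)), here (2, 2).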
- fit(x, y=None)[source]#
Fit the transform.
- Parameters:
- xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)
The time series dataset.
- yNone, optional
For compatibility.
- Returns:
- BaseAttributeTransform
This object.
- fit_transform(x, y=None)[source]#
Fit the embedding and return the transform of x.
- Parameters:
- xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)
The time series dataset.
- yNone, optional
For compatibility.
- Returns:
- ndarray of shape (n_samples, n_outputs)
The embedding.
- get_metadata_routing()[source]#
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns:
- routingMetadataRequest
A MetadataRequest encapsulating routing information.
- get_params(deep=True)[source]#
Get parameters for this estimator.
- Parameters:
- deepbool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- paramsdict
Parameter names mapped to their values.
- set_output(*, transform=None)[source]#
Set output container.
See Introducing the set_output API for an example on how to use the API.
- Parameters:
- transform{“default”, “pandas”, “polars”}, default=None
Configure output of transform and fit_transform.
“default”: Default output format of a transformer
“pandas”: DataFrame output
“polars”: Polars output
None: Transform configuration is unchanged
Added in version 1.4: “polars” option was added.
- Returns:
- selfestimator instance
Estimator instance.
- set_params(**params)[source]#
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **paramsdict
Estimator parameters.
- Returns:
- selfestimator instance
Estimator instance.
- class wildboar.transform.HydraTransform(*, n_groups=64, n_kernels=8, kernel_size=9, sampling='normal', sampling_params=None, n_jobs=None, random_state=None)[source]#
A Dictionary based method using convolutional kernels.
- Parameters:
- n_groupsint, optional
The number of groups of kernels.
- n_kernelsint, optional
The number of kernels per group.
- kernel_sizeint, optional
The size of the kernel.
- sampling{“normal”}, optional
The strategy for sampling kernels. By default kernel weights are sampled from a normal distribution with zero mean and unit standard deviation.
- sampling_paramsdict, optional
Parameters to the sampling approach. The “normal” sampler accepts two parameters: mean and scale.
- n_jobsint, optional
The number of jobs to run in parallel. A value of None means using a single core and a value of -1 means using all cores. Positive integers mean the exact number of cores.
- random_stateint or RandomState, optional
Controls the random sampling of kernels.
If int, random_state is the seed used by the random number generator.
If numpy.random.RandomState instance, random_state is the random number generator.
If None, the random number generator is the numpy.random.RandomState instance used by numpy.random.
See also
HydraClassifier
A classifier using hydra transform.
Notes
The implementation does not implement the first order discrete differences described by Dempster et al. (2023). If this is desired, one can use native scikit-learn functionalities and the DiffTransform:
>>> from sklearn.pipeline import make_pipeline, make_union
>>> from wildboar.transform import DiffTransform, HydraTransform
>>> dempster_hydra = make_union(
...     HydraTransform(n_groups=32),
...     make_pipeline(
...         DiffTransform(),
...         HydraTransform(n_groups=32)
...     )
... )
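For readers unfamiliar with the term, first order discrete differences replace each series with the differences between consecutive timesteps. The sketch below uses numpy.diff to show the operation; that DiffTransform computes exactly this (including how it handles the shortened output) is an assumption, and the helper name discrete_diff is hypothetical.

```python
import numpy as np


def discrete_diff(X, order=1):
    """Discrete differences of the given order along the time axis."""
    return np.diff(np.asarray(X, dtype=float), n=order, axis=-1)


X = np.array([[1.0, 4.0, 9.0, 16.0]])
d1 = discrete_diff(X)
d2 = discrete_diff(X, order=2)
```

Each order of differencing shortens the series by one timestep in this sketch.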
References
- Dempster, A., Schmidt, D. F., & Webb, G. I. (2023).
Hydra: competing convolutional kernels for fast and accurate time series classification. Data Mining and Knowledge Discovery
Examples
>>> from wildboar.datasets import load_gun_point
>>> from wildboar.transform import HydraTransform
>>> X, y = load_gun_point()
>>> t = HydraTransform(n_groups=8, n_kernels=4, random_state=1)
>>> t.fit_transform(X)
- Attributes:
- embedding_Embedding
The underlying embedding
- fit(x, y=None)[source]#
Fit the transform.
- Parameters:
- xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)
The time series dataset.
- yNone, optional
For compatibility.
- Returns:
- BaseAttributeTransform
This object.
- fit_transform(x, y=None)[source]#
Fit the embedding and return the transform of x.
- Parameters:
- xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)
The time series dataset.
- yNone, optional
For compatibility.
- Returns:
- ndarray of shape (n_samples, n_outputs)
The embedding.
- get_metadata_routing()[source]#
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns:
- routingMetadataRequest
A MetadataRequest encapsulating routing information.
- get_params(deep=True)[source]#
Get parameters for this estimator.
- Parameters:
- deepbool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- paramsdict
Parameter names mapped to their values.
- set_output(*, transform=None)[source]#
Set output container.
See Introducing the set_output API for an example on how to use the API.
- Parameters:
- transform{“default”, “pandas”, “polars”}, default=None
Configure output of transform and fit_transform.
“default”: Default output format of a transformer
“pandas”: DataFrame output
“polars”: Polars output
None: Transform configuration is unchanged
Added in version 1.4: “polars” option was added.
- Returns:
- selfestimator instance
Estimator instance.
- set_params(**params)[source]#
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **paramsdict
Estimator parameters.
- Returns:
- selfestimator instance
Estimator instance.
- class wildboar.transform.IntervalTransform(n_intervals='sqrt', *, intervals='fixed', sample_size=0.5, min_size=0.0, max_size=1.0, summarizer='mean_var_slope', n_jobs=None, random_state=None)[source]#
Embed a time series as a collection of features per interval.
- Parameters:
- n_intervalsstr, int or float, optional
The number of intervals to use for the transform.
if “log2”, the number of intervals is log2(n_timestep).
if “sqrt”, the number of intervals is sqrt(n_timestep).
if int, the number of intervals is n_intervals.
if float, the number of intervals is n_intervals * n_timestep, with 0 < n_intervals < 1.
Deprecated since version 1.2: The option “log” has been renamed to “log2”.
- intervalsstr, optional
The method for selecting intervals.
if “fixed”, n_intervals non-overlapping intervals.
if “sample”, n_intervals * sample_size non-overlapping intervals.
if “random”, n_intervals possibly overlapping intervals with sizes randomly sampled in [min_size * n_timestep, max_size * n_timestep].
- sample_sizefloat, optional
The sample size of fixed intervals if intervals=”sample”.
- min_sizefloat, optional
The minimum interval size if intervals=”random”.
- max_sizefloat, optional
The maximum interval size if intervals=”random”.
- summarizerstr or list, optional
The method to summarize each interval.
if str, the summarizer is determined by _SUMMARIZERS.keys().
if list, the summarizer is a list of functions f(x) -> float, where x is a numpy array.
The default summarizer summarizes each interval as its mean, standard deviation and slope.
- n_jobsint, optional
The number of cores to use on multi-core.
- random_stateint or RandomState
If int, random_state is the seed used by the random number generator
If RandomState instance, random_state is the random number generator
If None, the random number generator is the RandomState instance used by np.random.
Notes
Parallelization depends on releasing the global interpreter lock (GIL). As such, custom functions as summarizers reduce the performance. Wildboar implements summarizers for taking the mean (“mean”), variance (“variance”) and slope (“slope”) as well as their combination (“mean_var_slope”) and the full suite of catch22 features (“catch22”). In the future, we will allow downstream projects to implement their own summarizers in Cython which will allow for releasing the GIL.
References
- Lubba, Carl H., Sarab S. Sethi, Philip Knaute, Simon R. Schultz, Ben D. Fulcher, and Nick S. Jones.
catch22: Canonical time-series characteristics. Data Mining and Knowledge Discovery 33, no. 6 (2019): 1821-1852.
Examples
>>> import numpy as np
>>> from wildboar.datasets import load_dataset
>>> from wildboar.transform import IntervalTransform
>>> x, y = load_dataset("GunPoint")
>>> t = IntervalTransform(n_intervals=10, summarizer="mean")
>>> t.fit_transform(x)
Each interval (15 timepoints) is transformed to its mean.
>>> t = IntervalTransform(n_intervals="sqrt", summarizer=[np.mean, np.std])
>>> t.fit_transform(x)
Each interval (150 // 12 timepoints) is transformed to two features: the mean and the standard deviation.
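The n_intervals rules and the “fixed” interval strategy can be sketched in plain NumPy as follows. The helper names resolve_n_intervals and fixed_interval_means are hypothetical, and truncating fractional results to an integer is an assumption chosen to agree with the 12 intervals implied for 150 timesteps in the example above; wildboar may round differently.

```python
import math

import numpy as np


def resolve_n_intervals(n_intervals, n_timestep):
    """Map the n_intervals option to an integer number of intervals."""
    if n_intervals == "log2":
        return int(math.log2(n_timestep))
    if n_intervals == "sqrt":
        return int(math.sqrt(n_timestep))
    if isinstance(n_intervals, float):
        if not 0.0 < n_intervals < 1.0:
            raise ValueError("a float n_intervals must satisfy 0 < n_intervals < 1")
        return int(n_intervals * n_timestep)
    return int(n_intervals)


def fixed_interval_means(X, n_intervals):
    """Mean of each of n_intervals contiguous, non-overlapping intervals."""
    X = np.asarray(X, dtype=float)
    parts = np.array_split(X, n_intervals, axis=1)
    return np.column_stack([p.mean(axis=1) for p in parts])


X = np.arange(300, dtype=float).reshape(2, 150)
n = resolve_n_intervals("sqrt", X.shape[1])
X_t = fixed_interval_means(X, 10)
```

With 150 timesteps and 10 intervals, every interval covers 15 timepoints and the output has one mean per interval.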
- fit(x, y=None)[source]#
Fit the transform.
- Parameters:
- xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)
The time series dataset.
- yNone, optional
For compatibility.
- Returns:
- BaseAttributeTransform
This object.
- fit_transform(x, y=None)[source]#
Fit the embedding and return the transform of x.
- Parameters:
- xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)
The time series dataset.
- yNone, optional
For compatibility.
- Returns:
- ndarray of shape (n_samples, n_outputs)
The embedding.
- get_metadata_routing()[source]#
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns:
- routingMetadataRequest
A MetadataRequest encapsulating routing information.
- get_params(deep=True)[source]#
Get parameters for this estimator.
- Parameters:
- deepbool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- paramsdict
Parameter names mapped to their values.
- set_output(*, transform=None)[source]#
Set output container.
See Introducing the set_output API for an example on how to use the API.
- Parameters:
- transform{“default”, “pandas”, “polars”}, default=None
Configure output of transform and fit_transform.
“default”: Default output format of a transformer
“pandas”: DataFrame output
“polars”: Polars output
None: Transform configuration is unchanged
Added in version 1.4: “polars” option was added.
- Returns:
- selfestimator instance
Estimator instance.
- set_params(**params)[source]#
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **paramsdict
Estimator parameters.
- Returns:
- selfestimator instance
Estimator instance.
- class wildboar.transform.MatrixProfileTransform(window=0.1, exclude=None, n_jobs=None)[source]#
Matrix profile transform.
Transform each time series in a dataset to its MatrixProfile similarity self-join.
- Parameters:
- windowint or float, optional
The subsequence size, by default 0.1.
if float, a fraction of n_timestep.
if int, the exact subsequence size.
- excludeint or float, optional
The size of the exclusion zone. The default exclusion zone is 0.2.
if float, expressed as a fraction of the window size.
if int, exact size (0 < exclude).
- n_jobsint, optional
The number of jobs to use when computing the profile.
Examples
>>> from wildboar.datasets import load_two_lead_ecg
>>> from wildboar.transform import MatrixProfileTransform
>>> x, y = load_two_lead_ecg()
>>> t = MatrixProfileTransform()
>>> t.fit_transform(x)
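To make the roles of window and exclude concrete, here is a naive O(n^2) self-join in NumPy. It is a sketch under assumptions, not wildboar's algorithm: it uses z-normalized Euclidean distance, the helper name naive_matrix_profile is hypothetical, and it returns one value per subsequence, which need not match the output shape documented for transform.

```python
import numpy as np


def naive_matrix_profile(x, window, exclude):
    """Distance from each subsequence to its nearest non-trivial neighbor."""
    x = np.asarray(x, dtype=float)
    n_sub = len(x) - window + 1
    subs = np.lib.stride_tricks.sliding_window_view(x, window)
    std = subs.std(axis=1, keepdims=True)
    std[std == 0] = 1.0  # avoid division by zero for constant subsequences
    subs = (subs - subs.mean(axis=1, keepdims=True)) / std
    profile = np.full(n_sub, np.inf)
    for i in range(n_sub):
        dist = np.linalg.norm(subs - subs[i], axis=1)
        # Exclusion zone: ignore trivial matches that overlap position i.
        lo, hi = max(0, i - exclude), min(n_sub, i + exclude + 1)
        dist[lo:hi] = np.inf
        profile[i] = dist.min()
    return profile


x = np.sin(np.linspace(0, 8 * np.pi, 200))
window = int(0.1 * len(x))           # window=0.1 read as a fraction of n_timestep
exclude = max(1, int(0.2 * window))  # exclude=0.2 read as a fraction of the window
profile = naive_matrix_profile(x, window, exclude)
```

Without the exclusion zone, every subsequence would match itself or a near-identical shifted copy at distance close to zero, which makes the profile uninformative.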
- fit(x, y=None)[source]#
Fit the matrix profile.
Sets the expected input dimensions.
- Parameters:
- xarray-like of shape (n_samples, n_timesteps) or (n_samples, n_dims, n_timesteps)
The samples.
- yignored
The optional labels.
- Returns:
- self
A fitted instance.
- fit_transform(X, y=None, **fit_params)[source]#
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
- Parameters:
- Xarray-like of shape (n_samples, n_features)
Input samples.
- yarray-like of shape (n_samples,) or (n_samples, n_outputs), default=None
Target values (None for unsupervised transformations).
- **fit_paramsdict
Additional fit parameters.
- Returns:
- X_newndarray array of shape (n_samples, n_features_new)
Transformed array.
- get_metadata_routing()[source]#
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns:
- routingMetadataRequest
A MetadataRequest encapsulating routing information.
- get_params(deep=True)[source]#
Get parameters for this estimator.
- Parameters:
- deepbool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- paramsdict
Parameter names mapped to their values.
- set_output(*, transform=None)[source]#
Set output container.
See Introducing the set_output API for an example on how to use the API.
- Parameters:
- transform{“default”, “pandas”, “polars”}, default=None
Configure output of transform and fit_transform.
“default”: Default output format of a transformer
“pandas”: DataFrame output
“polars”: Polars output
None: Transform configuration is unchanged
Added in version 1.4: “polars” option was added.
- Returns:
- selfestimator instance
Estimator instance.
- set_params(**params)[source]#
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **paramsdict
Estimator parameters.
- Returns:
- selfestimator instance
Estimator instance.
- transform(x)[source]#
Transform the samples to their MatrixProfile self-join.
- Parameters:
- xarray-like of shape (n_samples, n_timesteps) or (n_samples, n_dims, n_timesteps)
The samples.
- Returns:
- ndarray of shape (n_samples, n_timestep) or (n_samples, n_dims, n_timesteps)
The matrix profile of each sample.
- class wildboar.transform.PAA(n_intervals='sqrt', window=None)[source]#
Piecewise aggregate approximation.
- Parameters:
- n_intervals{“sqrt”, “log2”}, int or float, optional
The number of intervals.
- windowint, optional
The size of an interval. If window is given, n_intervals is ignored.
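Piecewise aggregate approximation replaces each interval of a series with its mean. The NumPy sketch below illustrates the idea; the helper name paa is hypothetical, and deriving the number of intervals from window as n_timestep // window is an assumption about how the two parameters relate.

```python
import numpy as np


def paa(X, n_intervals=None, window=None):
    """Mean of each contiguous interval along the time axis."""
    X = np.asarray(X, dtype=float)
    if window is not None:
        # Assumption: a window size implies n_timestep // window intervals.
        n_intervals = X.shape[-1] // window
    if n_intervals is None:
        raise ValueError("either n_intervals or window must be given")
    parts = np.array_split(X, n_intervals, axis=-1)
    return np.stack([p.mean(axis=-1) for p in parts], axis=-1)


X = np.array([[1.0, 2.0, 3.0, 4.0, 5.0, 6.0]])
by_count = paa(X, n_intervals=3)
by_window = paa(X, window=2)
```

Both calls split the six timesteps into three intervals of two points each, so they produce the same approximation.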
- fit_transform(X, y=None, **fit_params)[source]#
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
- Parameters:
- Xarray-like of shape (n_samples, n_features)
Input samples.
- yarray-like of shape (n_samples,) or (n_samples, n_outputs), default=None
Target values (None for unsupervised transformations).
- **fit_paramsdict
Additional fit parameters.
- Returns:
- X_newndarray array of shape (n_samples, n_features_new)
Transformed array.
- get_metadata_routing()[source]#
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns:
- routingMetadataRequest
A MetadataRequest encapsulating routing information.
- get_params(deep=True)[source]#
Get parameters for this estimator.
- Parameters:
- deepbool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- paramsdict
Parameter names mapped to their values.
- set_output(*, transform=None)[source]#
Set output container.
See Introducing the set_output API for an example on how to use the API.
- Parameters:
- transform{“default”, “pandas”, “polars”}, default=None
Configure output of transform and fit_transform.
“default”: Default output format of a transformer
“pandas”: DataFrame output
“polars”: Polars output
None: Transform configuration is unchanged
Added in version 1.4: “polars” option was added.
- Returns:
- selfestimator instance
Estimator instance.
- set_params(**params)[source]#
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **paramsdict
Estimator parameters.
- Returns:
- selfestimator instance
Estimator instance.
- class wildboar.transform.PivotTransform(n_pivots=100, *, metric='auto', metric_params=None, metric_sample=None, random_state=None, n_jobs=None)[source]#
A transform using pivot time series and sampled distance metrics.
- Parameters:
- n_pivotsint, optional
The number of pivot time series.
- metric{‘auto’} or list, optional
If str, the metric to compute the distance.
If list, multiple metrics specified as a list of tuples, where the first element of the tuple is a metric name and the second element a dictionary with a parameter grid specification. A parameter grid specification is a dict with two mandatory and one optional key-value pairs defining the lower and upper bound on the values and the number of values in the grid. For example, to specify a grid over the argument ‘r’ with 10 values in the range 0 to 1, we would give the following specification: dict(min_r=0, max_r=1, num_r=10).
Read more about the metrics and their parameters in the User guide.
- metric_paramsdict, optional
Parameters for the distance measure. Ignored unless metric is a string.
Read more about the parameters in the User guide.
- metric_sample{“uniform”, “weighted”}, optional
If multiple metrics are specified this parameter controls how they are sampled. “uniform” samples each metric configuration with equal probability and “weighted” samples each metric with equal probability. By default, metric configurations are sampled with equal probability.
- random_stateint or np.RandomState, optional
The random state.
- n_jobsint, optional
The number of cores to use.
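The parameter grid specification described for metric can be illustrated with a small sketch. It expands dict(min_r=0, max_r=1, num_r=10) into concrete values, assuming the values are evenly spaced between the bounds. The helper name expand_grid and the default_num fallback are hypothetical and not part of wildboar; treating num_<name> as the optional key is an assumption based on the examples in this reference that omit it.

```python
def expand_grid(spec, default_num=10):
    """Expand {"min_r": 0, "max_r": 1, "num_r": 10} into {"r": [0.0, ..., 1.0]}.

    Hypothetical helper: evenly spaced values and default_num are assumptions.
    """
    # Every parameter in the grid is identified by its "min_" key.
    names = sorted(key[4:] for key in spec if key.startswith("min_"))
    grid = {}
    for name in names:
        low = spec[f"min_{name}"]
        high = spec[f"max_{name}"]
        num = spec.get(f"num_{name}", default_num)
        if num == 1:
            grid[name] = [float(low)]
        else:
            step = (high - low) / (num - 1)
            grid[name] = [low + i * step for i in range(num)]
    return grid


grid = expand_grid(dict(min_r=0, max_r=1, num_r=10))
print(len(grid["r"]), grid["r"][:3])
```

Each value in the expanded grid corresponds to one metric configuration that can be sampled when pivots are drawn.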
- fit(x, y=None)[source]#
Fit the transform.
- Parameters:
- xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)
The time series dataset.
- yNone, optional
For compatibility.
- Returns:
- BaseAttributeTransform
This object.
- fit_transform(x, y=None)[source]#
Fit the embedding and return the transform of x.
- Parameters:
- xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)
The time series dataset.
- yNone, optional
For compatibility.
- Returns:
- ndarray of shape (n_samples, n_outputs)
The embedding.
- get_metadata_routing()[source]#
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns:
- routingMetadataRequest
A MetadataRequest encapsulating routing information.
- get_params(deep=True)[source]#
Get parameters for this estimator.
- Parameters:
- deepbool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- paramsdict
Parameter names mapped to their values.
- set_output(*, transform=None)[source]#
Set output container.
See Introducing the set_output API for an example on how to use the API.
- Parameters:
- transform{“default”, “pandas”, “polars”}, default=None
Configure output of transform and fit_transform.
“default”: Default output format of a transformer
“pandas”: DataFrame output
“polars”: Polars output
None: Transform configuration is unchanged
Added in version 1.4: “polars” option was added.
- Returns:
- selfestimator instance
Estimator instance.
- set_params(**params)[source]#
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **paramsdict
Estimator parameters.
- Returns:
- selfestimator instance
Estimator instance.
- class wildboar.transform.ProximityTransform(n_pivots=100, metric='auto', metric_params=None, metric_sample='weighted', random_state=None, n_jobs=None)[source]#
Transform time series based on class conditional pivots.
- Parameters:
- n_pivotsint, optional
The number of pivot time series per class.
- metric{‘auto’} or list, optional
If str, the metric to compute the distance.
If list, multiple metrics specified as a list of tuples, where the first element of the tuple is a metric name and the second element a dictionary with a parameter grid specification. A parameter grid specification is a dict with two mandatory and one optional key-value pairs defining the lower and upper bound on the values and the number of values in the grid. For example, to specify a grid over the argument ‘r’ with 10 values in the range 0 to 1, we would give the following specification: dict(min_r=0, max_r=1, num_r=10).
Read more about the metrics and their parameters in the User guide.
- metric_paramsdict, optional
Parameters for the distance measure. Ignored unless metric is a string.
Read more about the parameters in the User guide.
- metric_sample{“uniform”, “weighted”}, optional
If multiple metrics are specified this parameter controls how they are sampled. “uniform” samples each metric configuration with equal probability and “weighted” samples each metric with equal probability. By default, metric configurations are sampled with equal probability.
- random_stateint or np.RandomState, optional
The random state.
- n_jobsint, optional
The number of cores to use.
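The difference between the two metric_sample strategies can be sketched as follows. The function sample_metric and the configs layout (a dict mapping a metric name to its list of parameter configurations) are hypothetical and only illustrate the description above; they are not wildboar's implementation.

```python
import random


def sample_metric(configs, metric_sample="weighted", rng=None):
    """Draw one (metric name, parameters) pair from configs.

    configs maps a metric name to a list of parameter dicts (hypothetical layout).
    """
    rng = rng or random.Random()
    if metric_sample == "uniform":
        # Every (metric, configuration) pair is equally likely, so metrics
        # with larger grids are drawn more often.
        flat = [(name, params) for name, grid in configs.items() for params in grid]
        return rng.choice(flat)
    if metric_sample == "weighted":
        # Every metric is equally likely; a configuration is then drawn
        # uniformly from that metric's own grid.
        name = rng.choice(sorted(configs))
        return name, rng.choice(configs[name])
    raise ValueError(f"unknown metric_sample: {metric_sample!r}")


configs = {
    "dtw": [{"r": r / 10} for r in range(10)],  # ten configurations
    "euclidean": [{}],                          # a single configuration
}
rng = random.Random(0)
weighted = [sample_metric(configs, "weighted", rng)[0] for _ in range(1000)]
uniform = [sample_metric(configs, "uniform", rng)[0] for _ in range(1000)]
print(weighted.count("euclidean") / 1000, uniform.count("euclidean") / 1000)
```

In this sketch, "weighted" selects the single-configuration metric about half of the time, whereas "uniform" selects it far less often because it contributes only one of the eleven configurations.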
- fit_transform(X, y=None, **fit_params)[source]#
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
- Parameters:
- Xarray-like of shape (n_samples, n_features)
Input samples.
- yarray-like of shape (n_samples,) or (n_samples, n_outputs), default=None
Target values (None for unsupervised transformations).
- **fit_paramsdict
Additional fit parameters.
- Returns:
- X_newndarray array of shape (n_samples, n_features_new)
Transformed array.
- get_metadata_routing()[source]#
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns:
- routingMetadataRequest
A MetadataRequest encapsulating routing information.
- get_params(deep=True)[source]#
Get parameters for this estimator.
- Parameters:
- deepbool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- paramsdict
Parameter names mapped to their values.
- set_output(*, transform=None)[source]#
Set output container.
See Introducing the set_output API for an example on how to use the API.
- Parameters:
- transform{“default”, “pandas”, “polars”}, default=None
Configure output of transform and fit_transform.
“default”: Default output format of a transformer
“pandas”: DataFrame output
“polars”: Polars output
None: Transform configuration is unchanged
Added in version 1.4: “polars” option was added.
- Returns:
- selfestimator instance
Estimator instance.
- set_params(**params)[source]#
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **paramsdict
Estimator parameters.
- Returns:
- selfestimator instance
Estimator instance.
- class wildboar.transform.RandomShapeletTransform(n_shapelets=1000, *, metric='euclidean', metric_params=None, min_shapelet_size=0.0, max_shapelet_size=1.0, n_jobs=None, random_state=None)[source]#
Random shapelet transform.
Transform a time series to the distances to a selection of random shapelets.
- Parameters:
- n_shapeletsint, optional
The number of shapelets in the resulting transform.
- metricstr or list, optional
If str, the distance metric used to identify the best shapelet.
If list, multiple metrics specified as a list of tuples, where the first element of the tuple is a metric name and the second element a dictionary with a parameter grid specification. A parameter grid specification is a dict with two mandatory and one optional key-value pairs defining the lower and upper bound on the values and the number of values in the grid. For example, to specify a grid over the argument ‘r’ with 10 values in the range 0 to 1, we would give the following specification: dict(min_r=0, max_r=1, num_r=10).
Read more about the metrics and their parameters in the User guide.
- metric_paramsdict, optional
Parameters for the distance measure. Ignored unless metric is a string.
Read more about the parameters in the User guide.
- min_shapelet_sizefloat, optional
Minimum shapelet size.
- max_shapelet_sizefloat, optional
Maximum shapelet size.
- n_jobsint, optional
The number of jobs to run in parallel. None means 1 and -1 means using all processors.
- random_stateint or RandomState, optional
If int, random_state is the seed used by the random number generator.
If RandomState instance, random_state is the random number generator.
If None, the random number generator is the RandomState instance used by np.random.
References
- Wistuba, Martin, Josif Grabocka, and Lars Schmidt-Thieme.
Ultra-fast shapelets for time series classification. arXiv preprint arXiv:1503.05018 (2015).
Examples
Transform each time series to the minimum DTW distance to each shapelet:
>>> from wildboar.datasets import load_gun_point
>>> from wildboar.transform import RandomShapeletTransform
>>> X, y = load_gun_point()
>>> t = RandomShapeletTransform(metric="dtw")
>>> t.fit_transform(X)
Transform each time series to either the minimum DTW distance, with r randomly set between 0 and 1, or the minimum ERP distance, with g randomly set between 0 and 1:
>>> t = RandomShapeletTransform(
...     metric=[
...         ("dtw", dict(min_r=0.0, max_r=1.0)),
...         ("erp", dict(min_g=0.0, max_g=1.0)),
...     ]
... )
>>> t.fit_transform(X)
- Attributes:
- embedding_Embedding
The underlying embedding object.
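To make the idea of "distances to a selection of random shapelets" concrete, the following sketch computes, for each series, the smallest Euclidean distance between a shapelet and any window of the same length. The function names are hypothetical, the shapelets are fixed by hand rather than sampled, and no normalization is applied; it is an illustration of the technique, not wildboar's implementation.

```python
import math


def min_shapelet_distance(series, shapelet):
    """Smallest Euclidean distance between shapelet and any same-length window."""
    m = len(shapelet)
    best = math.inf
    for start in range(len(series) - m + 1):
        window = series[start:start + m]
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(window, shapelet)))
        best = min(best, dist)
    return best


def shapelet_transform(X, shapelets):
    """Return one row per series and one column per shapelet."""
    return [[min_shapelet_distance(x, s) for s in shapelets] for x in X]


X = [[0.0, 1.0, 2.0, 1.0, 0.0], [0.0, 0.0, 0.0, 0.0, 0.0]]
shapelets = [[1.0, 2.0, 1.0], [0.0, 0.0]]
features = shapelet_transform(X, shapelets)
print(features)
```

The result has shape (n_samples, n_shapelets); a value of zero means the shapelet occurs exactly somewhere in the series.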
- fit(x, y=None)[source]#
Fit the transform.
- Parameters:
- xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)
The time series dataset.
- yNone, optional
For compatibility.
- Returns:
- BaseAttributeTransform
This object.
- fit_transform(x, y=None)[source]#
Fit the embedding and return the transform of x.
- Parameters:
- xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)
The time series dataset.
- yNone, optional
For compatibility.
- Returns:
- ndarray of shape (n_samples, n_outputs)
The embedding.
- get_metadata_routing()[source]#
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns:
- routingMetadataRequest
A MetadataRequest encapsulating routing information.
- get_params(deep=True)[source]#
Get parameters for this estimator.
- Parameters:
- deepbool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- paramsdict
Parameter names mapped to their values.
- set_output(*, transform=None)[source]#
Set output container.
See Introducing the set_output API for an example on how to use the API.
- Parameters:
- transform{“default”, “pandas”, “polars”}, default=None
Configure output of transform and fit_transform.
“default”: Default output format of a transformer
“pandas”: DataFrame output
“polars”: Polars output
None: Transform configuration is unchanged
Added in version 1.4: “polars” option was added.
- Returns:
- selfestimator instance
Estimator instance.
- set_params(**params)[source]#
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **paramsdict
Estimator parameters.
- Returns:
- selfestimator instance
Estimator instance.
- class wildboar.transform.RocketTransform(n_kernels=1000, *, sampling='normal', sampling_params=None, kernel_size=None, min_size=None, max_size=None, bias_prob=1.0, normalize_prob=1.0, padding_prob=0.5, n_jobs=None, random_state=None)[source]#
Transform a time series using random convolution features.
- Parameters:
- n_kernelsint, optional
The number of kernels to sample at each node.
- sampling{“normal”, “uniform”, “shapelet”}, optional
The sampling of convolutional filters.
if “normal”, sample filter according to a normal distribution with mean and scale.
if “uniform”, sample filter according to a uniform distribution with lower and upper.
if “shapelet”, sample filters as subsequences in the training data.
- sampling_paramsdict, optional
Parameters for the sampling strategy.
if “normal”, {"mean": float, "scale": float}, defaults to {"mean": 0, "scale": 1}.
if “uniform”, {"lower": float, "upper": float}, defaults to {"lower": -1, "upper": 1}.
- kernel_sizearray-like, optional
The kernel size, by default [7, 11, 13].
- min_sizefloat, optional
The minimum timestep size used for generating kernel sizes. If set, kernel_size is ignored.
- max_sizefloat, optional
The maximum timestep size used for generating kernel sizes. If set, kernel_size is ignored.
- bias_probfloat, optional
The probability of using the bias term.
- normalize_probfloat, optional
The probability of performing normalization.
- padding_probfloat, optional
The probability of padding with zeros.
- n_jobsint, optional
The number of jobs to run in parallel. A value of None means using a single core and a value of -1 means using all cores. Positive integers mean the exact number of cores.
- random_stateint or RandomState, optional
Controls the random resampling of the original dataset.
If int, random_state is the seed used by the random number generator.
If numpy.random.RandomState instance, random_state is the random number generator.
If None, the random number generator is the numpy.random.RandomState instance used by numpy.random.
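As a rough illustration of a random convolution feature, the sketch below draws one kernel with the "normal" sampling strategy and summarizes its output by the proportion of positive values and the maximum, the two summaries used in the ROCKET paper referenced for this class. The function name, the bias range, and the omission of dilation, padding and normalization are simplifying assumptions; this is not wildboar's implementation.

```python
import math
import random


def random_kernel_features(series, kernel_size=7, mean=0.0, scale=1.0, seed=0):
    """Return (proportion of positive values, maximum) for one random kernel."""
    rng = random.Random(seed)
    # "normal" sampling: weights drawn from a normal distribution.
    weights = [rng.gauss(mean, scale) for _ in range(kernel_size)]
    bias = rng.uniform(-1.0, 1.0)  # assumption: illustrative bias range
    # Slide the kernel over the series without padding or dilation.
    outputs = [
        sum(w * series[i + j] for j, w in enumerate(weights)) + bias
        for i in range(len(series) - kernel_size + 1)
    ]
    ppv = sum(1 for value in outputs if value > 0) / len(outputs)
    return ppv, max(outputs)


series = [math.sin(0.3 * t) for t in range(50)]
ppv, peak = random_kernel_features(series, seed=1)
print(ppv, peak)
```

Repeating this for n_kernels independently sampled kernels and concatenating the summaries yields one feature row per time series.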
References
- Dempster, Angus, François Petitjean, and Geoffrey I. Webb.
ROCKET: exceptionally fast and accurate time series classification using random convolutional kernels. Data Mining and Knowledge Discovery 34.5 (2020): 1454-1495.
Examples
>>> from wildboar.datasets import load_gun_point
>>> from wildboar.transform import RocketTransform
>>> X, y = load_gun_point()
>>> t = RocketTransform(n_kernels=10, random_state=1)
>>> t.fit_transform(X)
array([[0.51333333, 5.11526939, 0.47333333, ..., 2.04712544, 0.24      ,
        0.82912261],
       [0.52666667, 5.26611524, 0.54      , ..., 1.98047216, 0.24      ,
        0.81260641],
       [0.54666667, 4.71210092, 0.35333333, ..., 2.28841158, 0.25333333,
        0.82203705],
       ...,
       [0.54666667, 4.72938203, 0.45333333, ..., 2.53756324, 0.24666667,
        0.8380654 ],
       [0.68666667, 3.80533684, 0.26      , ..., 2.41709413, 0.25333333,
        0.65634235],
       [0.66      , 3.94724793, 0.32666667, ..., 1.85575661, 0.25333333,
        0.67630249]])
- Attributes:
- embedding_Embedding
The underlying embedding object.
- fit(x, y=None)[source]#
Fit the transform.
- Parameters:
- xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)
The time series dataset.
- yNone, optional
For compatibility.
- Returns:
- BaseAttributeTransform
This object.
- fit_transform(x, y=None)[source]#
Fit the embedding and return the transform of x.
- Parameters:
- xarray-like of shape (n_samples, n_timestep) or (n_samples, n_dimensions, n_timestep)
The time series dataset.
- yNone, optional
For compatibility.
- Returns:
- ndarray of shape (n_samples, n_outputs)
The embedding.
- get_metadata_routing()[source]#
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns:
- routingMetadataRequest
A MetadataRequest encapsulating routing information.
- get_params(deep=True)[source]#
Get parameters for this estimator.
- Parameters:
- deepbool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- paramsdict
Parameter names mapped to their values.
- set_output(*, transform=None)[source]#
Set output container.
See Introducing the set_output API for an example on how to use the API.
- Parameters:
- transform{“default”, “pandas”, “polars”}, default=None
Configure output of transform and fit_transform.
“default”: Default output format of a transformer
“pandas”: DataFrame output
“polars”: Polars output
None: Transform configuration is unchanged
Added in version 1.4: “polars” option was added.
- Returns:
- selfestimator instance
Estimator instance.
- set_params(**params)[source]#
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **paramsdict
Estimator parameters.
- Returns:
- selfestimator instance
Estimator instance.
- class wildboar.transform.SAX(*, n_intervals='sqrt', window=None, n_bins=4, binning='normal', estimate=True)[source]#
Symbolic aggregate approximation.
- Parameters:
- n_intervalsstr, optional
The number of intervals to use for the transform.
if “log2”, the number of intervals is log2(n_timestep).
if “sqrt”, the number of intervals is sqrt(n_timestep).
if int, the number of intervals is n_intervals.
if float, the number of intervals is n_intervals * n_timestep, with 0 < n_intervals < 1.
- windowint, optional
The window size. If window is set, the value of n_intervals has no effect.
- n_binsint, optional
The number of bins.
- binningstr, optional
The bin construction. By default the bins are defined according to the normal distribution. Possible values are “normal” for normally distributed bins or “uniform” for uniformly distributed bins.
- estimatebool, optional
Estimate the distribution parameters for the binning from data.
If estimate=False, it is assumed that each time series is preprocessed using:
datasets.preprocess.normalize when binning=”normal”.
datasets.preprocess.minmax_scale when binning=”uniform”.
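The rules for n_intervals listed above can be summarized in a short sketch. The helper resolve_n_intervals is hypothetical, and rounding up to the next integer and clamping to the range 1 to n_timestep are assumptions; the reference does not state how fractional results are rounded.

```python
import math


def resolve_n_intervals(n_intervals, n_timestep):
    """Map an n_intervals setting to a concrete number of intervals."""
    if n_intervals == "sqrt":
        n = math.ceil(math.sqrt(n_timestep))
    elif n_intervals == "log2":
        n = math.ceil(math.log2(n_timestep))
    elif isinstance(n_intervals, int):
        n = n_intervals
    elif isinstance(n_intervals, float):
        if not 0.0 < n_intervals < 1.0:
            raise ValueError("a float n_intervals must satisfy 0 < n_intervals < 1")
        n = math.ceil(n_intervals * n_timestep)
    else:
        raise ValueError(f"unsupported n_intervals: {n_intervals!r}")
    # Assumption: keep the result within a valid range.
    return max(1, min(n, n_timestep))


for setting in ("sqrt", "log2", 10, 0.5):
    print(setting, resolve_n_intervals(setting, 150))
```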
- fit_transform(X, y=None, **fit_params)[source]#
Fit to data, then transform it.
Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.
- Parameters:
- Xarray-like of shape (n_samples, n_features)
Input samples.
- yarray-like of shape (n_samples,) or (n_samples, n_outputs), default=None
Target values (None for unsupervised transformations).
- **fit_paramsdict
Additional fit parameters.
- Returns:
- X_newndarray array of shape (n_samples, n_features_new)
Transformed array.
- get_metadata_routing()[source]#
Get metadata routing of this object.
Please check User Guide on how the routing mechanism works.
- Returns:
- routingMetadataRequest
A MetadataRequest encapsulating routing information.
- get_params(deep=True)[source]#
Get parameters for this estimator.
- Parameters:
- deepbool, default=True
If True, will return the parameters for this estimator and contained subobjects that are estimators.
- Returns:
- paramsdict
Parameter names mapped to their values.
- set_output(*, transform=None)[source]#
Set output container.
See Introducing the set_output API for an example on how to use the API.
- Parameters:
- transform{“default”, “pandas”, “polars”}, default=None
Configure output of transform and fit_transform.
“default”: Default output format of a transformer
“pandas”: DataFrame output
“polars”: Polars output
None: Transform configuration is unchanged
Added in version 1.4: “polars” option was added.
- Returns:
- selfestimator instance
Estimator instance.
- set_params(**params)[source]#
Set the parameters of this estimator.
The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.
- Parameters:
- **paramsdict
Estimator parameters.
- Returns:
- selfestimator instance
Estimator instance.
- wildboar.transform.convolve(X, kernel, bias=0.0, *, dilation=1, stride=1, padding=0)[source]#
Apply 1D convolution over a time series.
- Parameters:
- Xarray-like of shape (n_samples, n_timestep)
The input.
- kernelarray-like of shape (kernel_size, )
The kernel.
- biasfloat, optional
The bias.
- dilationint, optional
The spacing between kernel elements.
- strideint, optional
The stride of the convolving kernel.
- paddingint, optional
Implicit padding on both sides of the input time series.
- Returns:
- ndarray of shape (n_samples, output_size)
The result of the convolution, where output_size is given by:
floor(((X.shape[1] + 2 * padding) - ((kernel.shape[0] - 1) * dilation + 1)) / stride + 1).
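The relation between dilation, stride, padding and the output size can be checked with a small pure-Python sketch for a single time series. It assumes zero padding, no kernel flipping (cross-correlation, as is common in machine learning libraries), and a dilated kernel that spans (kernel_size - 1) * dilation + 1 timesteps. The function convolve_1d is hypothetical and is not the wildboar implementation.

```python
import math


def convolve_1d(x, kernel, bias=0.0, dilation=1, stride=1, padding=0):
    """Dilated, strided 1D convolution of one series with zero padding."""
    padded = [0.0] * padding + list(x) + [0.0] * padding
    # Number of timesteps covered by the dilated kernel.
    span = (len(kernel) - 1) * dilation + 1
    output_size = math.floor((len(padded) - span) / stride + 1)
    out = []
    for i in range(output_size):
        start = i * stride
        out.append(
            bias + sum(w * padded[start + j * dilation] for j, w in enumerate(kernel))
        )
    return out


x = [1, 2, 3, 4, 5, 6]
result = convolve_1d(x, [1, 0, -1], dilation=2, stride=1, padding=1)
print(len(result), result)
```

With 6 timesteps, padding 1, a kernel of size 3 and dilation 2, the padded length is 8 and the kernel spans 5 timesteps, so the output has floor((8 - 5) / 1 + 1) = 4 values.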
- wildboar.transform.piecewice_aggregate_approximation(x, *, n_intervals='sqrt', window=None)[source]#
Piecewise aggregate approximation.
- Parameters:
- xarray-like of shape (n_samples, n_timestep)
The input data.
- n_intervalsstr, optional
The number of intervals to use for the transform.
if “log2”, the number of intervals is log2(n_timestep).
if “sqrt”, the number of intervals is sqrt(n_timestep).
if int, the number of intervals is n_intervals.
if float, the number of intervals is n_intervals * n_timestep, with 0 < n_intervals < 1.
- windowint, optional
The window size. If window is set, the value of n_intervals has no effect.
- Returns:
- ndarray of shape (n_samples, n_intervals)
The piecewise aggregate approximation.
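For intuition, piecewise aggregate approximation replaces each interval of a time series with its mean. The sketch below does this for a single series in pure Python; the function name paa and the way interval boundaries are chosen when the length is not divisible by n_intervals are assumptions, not wildboar's implementation.

```python
def paa(series, n_intervals):
    """Mean of each of n_intervals contiguous, near-equal-length segments."""
    n = len(series)
    out = []
    for k in range(n_intervals):
        # Integer boundaries; segments differ by at most one element in length.
        start = k * n // n_intervals
        end = (k + 1) * n // n_intervals
        segment = series[start:end]
        out.append(sum(segment) / len(segment))
    return out


print(paa([1, 2, 3, 4, 5, 6], 3))
```

A series of length n_timestep is thereby reduced to n_intervals values, which matches the documented output shape of (n_samples, n_intervals).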
- wildboar.transform.symbolic_aggregate_approximation(x, *, n_intervals='sqrt', window=None, n_bins=4, binning='normal')[source]#
Symbolic aggregate approximation.
- Parameters:
- xarray-like of shape (n_samples, n_timestep)
The input data.
- n_intervalsstr, optional
The number of intervals to use for the transform.
if “log2”, the number of intervals is log2(n_timestep).
if “sqrt”, the number of intervals is sqrt(n_timestep).
if int, the number of intervals is n_intervals.
if float, the number of intervals is n_intervals * n_timestep, with 0 < n_intervals < 1.
- windowint, optional
The window size. If window is set, the value of n_intervals has no effect.
- n_binsint, optional
The number of bins.
- binningstr, optional
The bin construction. By default the bins are defined according to the normal distribution. Possible values are "normal" for normally distributed bins or "uniform" for uniformly distributed bins.
- Returns:
- ndarray of shape (n_samples, n_intervals)
The symbolic aggregate approximation.
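Symbolic aggregate approximation with "normal" binning can be sketched as three steps: standardize the series, take the mean of each interval, and map each mean to the index of the equiprobable standard-normal bin it falls into. The function sax below is a hypothetical pure-Python illustration for a single series with non-zero variance; the interval boundaries and the use of integer bin indices as symbols are assumptions, not wildboar's implementation.

```python
from bisect import bisect_right
from statistics import NormalDist, mean, pstdev


def sax(series, n_intervals, n_bins=4):
    """Return one bin index in [0, n_bins) per interval."""
    mu, sigma = mean(series), pstdev(series)
    z = [(value - mu) / sigma for value in series]  # assumes sigma > 0
    n = len(z)
    # Piecewise aggregate approximation of the standardized series.
    means = []
    for k in range(n_intervals):
        segment = z[k * n // n_intervals:(k + 1) * n // n_intervals]
        means.append(sum(segment) / len(segment))
    # Breakpoints that split the standard normal into n_bins equiprobable bins.
    breakpoints = [NormalDist().inv_cdf(i / n_bins) for i in range(1, n_bins)]
    return [bisect_right(breakpoints, m) for m in means]


print(sax(list(range(8)), n_intervals=4))
```

For "uniform" binning, the same idea would apply with equally spaced breakpoints on a min-max scaled series instead of normal quantiles.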