wildboar.distance.lb

Lower bounds for distance metrics.

Classes

DtwKeoghLowerBound

Lower bound for dynamic time warping.

DtwKimLowerBound

Lower bound for dynamic time warping computed in constant time.

PaaLowerBound

Lower bound for the Euclidean distance between z-normalized time series, computed from PAA.

SaxLowerBound

Lower bound for the Euclidean distance between z-normalized time series, computed from SAX.


class wildboar.distance.lb.DtwKeoghLowerBound(r=1.0, *, kind='both')

Lower bound for dynamic time warping.

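Implements LB_Keogh, a lower bound used to speed up similarity search under dynamic time warping. The bound compares one time series against the upper and lower envelope of the other, computed within the warping window. This is much cheaper than computing the full DTW distance, so candidates whose bound already exceeds the best distance found so far can be discarded without computing DTW.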

Parameters:
r : float, default=1.0

The warp window for DTW.

kind : {“both”, “left”, “right”}, default=“both”
  • If “both”, compute the bound for both sides and take the maximum.

  • If “left”, compute the bound only for the query.

  • If “right”, compute the bound only for the data.
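The envelope idea can be illustrated with a minimal pure-Python sketch. It is not wildboar's implementation: the function names are hypothetical, the window is given as an integer number of timesteps, DTW is taken as the square root of the summed squared differences along the warping path, and the mapping of “left” and “right” to the two envelopes is an assumption made for illustration.

```python
import math


def envelope(series, window):
    """Running min and max of `series` within +/- `window` timesteps."""
    n = len(series)
    lower, upper = [], []
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        lower.append(min(series[lo:hi]))
        upper.append(max(series[lo:hi]))
    return lower, upper


def lb_keogh_one_sided(candidate, enveloped, window):
    """Distance from `candidate` to the envelope built around `enveloped`."""
    lower, upper = envelope(enveloped, window)
    total = 0.0
    for value, lo, hi in zip(candidate, lower, upper):
        if value > hi:
            total += (value - hi) ** 2
        elif value < lo:
            total += (lo - value) ** 2
    return math.sqrt(total)


def lb_keogh(query, sample, window, kind="both"):
    # Assumption: "left" envelopes the query, "right" envelopes the data.
    left = lb_keogh_one_sided(sample, query, window)
    right = lb_keogh_one_sided(query, sample, window)
    if kind == "left":
        return left
    if kind == "right":
        return right
    return max(left, right)  # both are valid bounds, so the larger is tighter


def dtw(x, y, window):
    """Reference DTW with a Sakoe-Chiba band, for equal-length series."""
    n, m = len(x), len(y)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - window), min(m, i + window) + 1):
            d = (x[i - 1] - y[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])


query = [0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0, 0.0]
sample = [0.0, 0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0]
print(lb_keogh(query, sample, window=1), dtw(query, sample, window=1))
```

Each one-sided bound costs time linear in the series length once the envelope is known, whereas the reference DTW above fills a band of the cost matrix.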

Examples

>>> from wildboar.datasets import load_gun_point
>>> from wildboar.distance import argmin_distance
>>> from wildboar.distance.lb import DtwKeoghLowerBound
>>> X, y = load_gun_point()
>>> lbkeogh = DtwKeoghLowerBound(r=0.1).fit(X[30:])
>>> argmin_distance(
...     X[:30],  # query
...     X[30:],  # database
...     metric="dtw",
...     metric_params={"r": 0.1},
...     lower_bound=lbkeogh.transform(X[:30])
... )
fit(X, y=None)

Fit the lower bound for time series.

Parameters:
X : array-like of shape (n_samples, n_timesteps)

The database of time series that queries are compared against.

y : ignored, optional

For API compatibility.

Returns:
self

The estimator.

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits the transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters:
X : array-like of shape (n_samples, n_features)

Input samples.

y : array-like of shape (n_samples,) or (n_samples, n_outputs), default=None

Target values (None for unsupervised transformations).

**fit_params : dict

Additional fit parameters.

Returns:
X_new : ndarray of shape (n_samples, n_features_new)

Transformed array.

get_metadata_routing()

Get metadata routing of this object.

See the scikit-learn User Guide for how the routing mechanism works.

Returns:
routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
params : dict

Parameter names mapped to their values.

set_output(*, transform=None)

Set output container.

See “Introducing the set_output API” for an example of how to use the API.

Parameters:
transform : {“default”, “pandas”, “polars”}, default=None

Configure output of transform and fit_transform.

  • “default”: Default output format of a transformer

  • “pandas”: DataFrame output

  • “polars”: Polars output

  • None: Transform configuration is unchanged

Added in version 1.4: “polars” option was added.

Returns:
self : estimator instance

Estimator instance.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**params : dict

Estimator parameters.

Returns:
self : estimator instance

Estimator instance.

transform(X)

Compute lower bound for query.

Parameters:
X : array-like of shape (n_queries, n_timesteps)

The query.

Returns:
ndarray of shape (n_queries, n_samples)

The lower bound of the distance between the i-th query and the j-th database sample.

class wildboar.distance.lb.DtwKimLowerBound

Lower bound for dynamic time warping computed in constant time.

The bound is very fast to compute but loose, so it prunes few candidates.
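To see why a constant-time bound is possible, consider a simplified variant that looks only at the endpoints. This is a sketch, not necessarily the exact bound wildboar computes: the function name is hypothetical, and DTW is assumed to be the square root of the summed squared differences along the warping path. Every warping path must align the first points with each other and the last points with each other, so those two costs can never exceed the full DTW cost.

```python
import math


def lb_kim_endpoints(x, y):
    """Constant-time DTW lower bound from the first and last points only."""
    if len(x) == 1 and len(y) == 1:
        # First and last alignment are the same cell; count it once.
        return abs(x[0] - y[0])
    first = (x[0] - y[0]) ** 2
    last = (x[-1] - y[-1]) ** 2
    return math.sqrt(first + last)


print(lb_kim_endpoints([0.0, 1.0, 2.0], [1.0, 1.0, 5.0]))
```

Because only two points of each series are inspected, the cost does not depend on the series length, but series that differ mainly in the middle receive a bound close to zero.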

fit(X, y=None)

Fit the lower bound for time series.

Parameters:
X : array-like of shape (n_samples, n_timesteps)

The database of time series that queries are compared against.

y : ignored, optional

For API compatibility.

Returns:
self

The estimator.

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits the transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters:
X : array-like of shape (n_samples, n_features)

Input samples.

y : array-like of shape (n_samples,) or (n_samples, n_outputs), default=None

Target values (None for unsupervised transformations).

**fit_params : dict

Additional fit parameters.

Returns:
X_new : ndarray of shape (n_samples, n_features_new)

Transformed array.

get_metadata_routing()

Get metadata routing of this object.

See the scikit-learn User Guide for how the routing mechanism works.

Returns:
routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
params : dict

Parameter names mapped to their values.

set_output(*, transform=None)

Set output container.

See “Introducing the set_output API” for an example of how to use the API.

Parameters:
transform : {“default”, “pandas”, “polars”}, default=None

Configure output of transform and fit_transform.

  • “default”: Default output format of a transformer

  • “pandas”: DataFrame output

  • “polars”: Polars output

  • None: Transform configuration is unchanged

Added in version 1.4: “polars” option was added.

Returns:
self : estimator instance

Estimator instance.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**params : dict

Estimator parameters.

Returns:
self : estimator instance

Estimator instance.

transform(X)

Compute lower bound for query.

Parameters:
X : array-like of shape (n_queries, n_timesteps)

The query.

Returns:
ndarray of shape (n_queries, n_samples)

The lower bound of the distance between the i-th query and the j-th database sample.
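One way a row of such a matrix could be used to prune a nearest-neighbor search is sketched below. This illustrates the general technique only and makes no claim about how argmin_distance works internally; the function names are hypothetical, and the bound used in the usage lines (the difference between first points) is a deliberately simple stand-in for a real lower bound.

```python
import math


def euclidean(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def nearest_with_pruning(query, database, lower_bounds, distance=euclidean):
    """Return (index, distance, n_computed) for the nearest database sample.

    `lower_bounds[j]` must never exceed distance(query, database[j]).
    """
    best_index, best_distance, n_computed = -1, float("inf"), 0
    # Visit candidates in order of increasing bound so a good
    # best-so-far distance is found early.
    for j in sorted(range(len(database)), key=lambda j: lower_bounds[j]):
        if lower_bounds[j] >= best_distance:
            break  # all remaining bounds are at least as large
        n_computed += 1
        d = distance(query, database[j])
        if d < best_distance:
            best_index, best_distance = j, d
    return best_index, best_distance, n_computed


database = [[0.0, 0.0], [3.0, 4.0], [10.0, 10.0]]
query = [0.0, 1.0]
# Stand-in bound: |difference of first points| <= Euclidean distance.
bounds = [abs(query[0] - sample[0]) for sample in database]
print(nearest_with_pruning(query, database, bounds))  # -> (0, 1.0, 1)
```

In this toy run only one of the three exact distances is computed; the other two candidates are rejected by their bounds alone.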

class wildboar.distance.lb.PaaLowerBound(*, window=None, n_intervals='sqrt')

Lower bound for the Euclidean distance between z-normalized time series.

The lower bound is computed from the piecewise aggregate approximation (PAA) of each time series.
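The idea behind a PAA-based bound can be sketched in a few lines of pure Python. This is an illustration under simplifying assumptions, not wildboar's implementation: the function names are hypothetical, the series length must be divisible by the number of intervals, and z-normalization is left out because the inequality holds for any pair of equal-length series.

```python
import math


def paa(series, n_intervals):
    """Mean of each of `n_intervals` equal-width segments."""
    width, remainder = divmod(len(series), n_intervals)
    if remainder:
        raise ValueError("this sketch needs len(series) divisible by n_intervals")
    return [
        sum(series[i * width:(i + 1) * width]) / width
        for i in range(n_intervals)
    ]


def paa_lower_bound(x, y, n_intervals):
    """sqrt(width) * ||PAA(x) - PAA(y)||, which never exceeds ||x - y||."""
    width = len(x) // n_intervals
    px, py = paa(x, n_intervals), paa(y, n_intervals)
    return math.sqrt(width * sum((a - b) ** 2 for a, b in zip(px, py)))


x = [0.0, 2.0, 4.0, 6.0]
y = [1.0, 1.0, 5.0, 9.0]
exact = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
print(paa_lower_bound(x, y, n_intervals=2), exact)
```

The bound follows from the Cauchy-Schwarz inequality applied within each segment: the squared difference of segment means, scaled by the segment width, cannot exceed the sum of squared pointwise differences in that segment.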

Parameters:
window : int, optional

The size of an interval. If window is given, n_intervals is ignored.

n_intervals : {“sqrt”, “log2”}, int or float, optional

The number of intervals.

Examples

>>> from wildboar.datasets import load_gun_point
>>> from wildboar.distance import argmin_distance
>>> from wildboar.distance.lb import PaaLowerBound
>>> X, y = load_gun_point()
>>> lbpaa = PaaLowerBound().fit(X[30:])
>>> argmin_distance(X[:30], X[30:], lower_bound=lbpaa.transform(X[:30]))
fit(X, y=None)

Fit the lower bound for time series.

Parameters:
X : array-like of shape (n_samples, n_timesteps)

The database of time series that queries are compared against.

y : ignored, optional

For API compatibility.

Returns:
self

The estimator.

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits the transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters:
X : array-like of shape (n_samples, n_features)

Input samples.

y : array-like of shape (n_samples,) or (n_samples, n_outputs), default=None

Target values (None for unsupervised transformations).

**fit_params : dict

Additional fit parameters.

Returns:
X_new : ndarray of shape (n_samples, n_features_new)

Transformed array.

get_metadata_routing()

Get metadata routing of this object.

See the scikit-learn User Guide for how the routing mechanism works.

Returns:
routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
params : dict

Parameter names mapped to their values.

set_output(*, transform=None)

Set output container.

See “Introducing the set_output API” for an example of how to use the API.

Parameters:
transform : {“default”, “pandas”, “polars”}, default=None

Configure output of transform and fit_transform.

  • “default”: Default output format of a transformer

  • “pandas”: DataFrame output

  • “polars”: Polars output

  • None: Transform configuration is unchanged

Added in version 1.4: “polars” option was added.

Returns:
self : estimator instance

Estimator instance.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**params : dict

Estimator parameters.

Returns:
self : estimator instance

Estimator instance.

transform(X)

Compute lower bound for query.

Parameters:
X : array-like of shape (n_queries, n_timesteps)

The query.

Returns:
ndarray of shape (n_queries, n_samples)

The lower bound of the distance between the i-th query and the j-th database sample.

class wildboar.distance.lb.SaxLowerBound(*, window=None, n_intervals='sqrt', n_bins=10)

Lower bound for the Euclidean distance between z-normalized time series.

The lower bound is computed from the symbolic aggregate approximation (SAX) of each time series.
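A SAX-based bound can be sketched as follows, using the MINDIST measure commonly associated with SAX. This is an illustrative sketch and not wildboar's implementation: the function names are hypothetical, the breakpoints are assumed to split a standard normal distribution into equiprobable bins, and the series length must be divisible by the number of intervals.

```python
import math
from bisect import bisect_left
from statistics import NormalDist


def breakpoints(n_bins):
    """Cut points splitting a standard normal into `n_bins` equiprobable bins."""
    return [NormalDist().inv_cdf(k / n_bins) for k in range(1, n_bins)]


def sax_word(series, n_intervals, cuts):
    """PAA segment means mapped to bin indices (0 .. n_bins - 1)."""
    width = len(series) // n_intervals
    means = [
        sum(series[i * width:(i + 1) * width]) / width
        for i in range(n_intervals)
    ]
    return [bisect_left(cuts, mean) for mean in means]


def sax_lower_bound(word_x, word_y, cuts, n_timesteps):
    """MINDIST between two SAX words of equal length."""
    total = 0.0
    for a, b in zip(word_x, word_y):
        if abs(a - b) > 1:  # adjacent or equal bins contribute nothing
            lo, hi = min(a, b), max(a, b)
            # Smallest possible gap between values in bins lo and hi.
            total += (cuts[hi - 1] - cuts[lo]) ** 2
    return math.sqrt(n_timesteps / len(word_x) * total)


cuts = breakpoints(4)
x = [-1.5, -1.5, 1.5, 1.5]
y = [1.5, 1.5, -1.5, -1.5]
wx, wy = sax_word(x, 2, cuts), sax_word(y, 2, cuts)
print(wx, wy, sax_lower_bound(wx, wy, cuts, n_timesteps=len(x)))
```

Because each segment mean is replaced by the bin it falls in, this bound is never larger than the PAA-based bound for the same intervals; more bins narrow that gap at the cost of a larger alphabet.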

Parameters:
window : int, optional

The size of an interval. If window is given, n_intervals is ignored.

n_intervals : {“sqrt”, “log2”}, int or float, optional

The number of intervals.

n_bins : int, optional

The number of bins.

Examples

>>> from wildboar.datasets import load_gun_point
>>> from wildboar.distance import argmin_distance
>>> from wildboar.distance.lb import SaxLowerBound
>>> X, y = load_gun_point()
>>> lbsax = SaxLowerBound(n_bins=20).fit(X[30:])
>>> argmin_distance(X[:30], X[30:], lower_bound=lbsax.transform(X[:30]))
fit(X, y=None)

Fit the lower bound for time series.

Parameters:
X : array-like of shape (n_samples, n_timesteps)

The database of time series that queries are compared against.

y : ignored, optional

For API compatibility.

Returns:
self

The estimator.

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits the transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters:
X : array-like of shape (n_samples, n_features)

Input samples.

y : array-like of shape (n_samples,) or (n_samples, n_outputs), default=None

Target values (None for unsupervised transformations).

**fit_params : dict

Additional fit parameters.

Returns:
X_new : ndarray of shape (n_samples, n_features_new)

Transformed array.

get_metadata_routing()

Get metadata routing of this object.

See the scikit-learn User Guide for how the routing mechanism works.

Returns:
routing : MetadataRequest

A MetadataRequest encapsulating routing information.

get_params(deep=True)

Get parameters for this estimator.

Parameters:
deep : bool, default=True

If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:
params : dict

Parameter names mapped to their values.

set_output(*, transform=None)

Set output container.

See “Introducing the set_output API” for an example of how to use the API.

Parameters:
transform : {“default”, “pandas”, “polars”}, default=None

Configure output of transform and fit_transform.

  • “default”: Default output format of a transformer

  • “pandas”: DataFrame output

  • “polars”: Polars output

  • None: Transform configuration is unchanged

Added in version 1.4: “polars” option was added.

Returns:
self : estimator instance

Estimator instance.

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:
**params : dict

Estimator parameters.

Returns:
self : estimator instance

Estimator instance.

transform(X)

Compute lower bound for query.

Parameters:
X : array-like of shape (n_queries, n_timesteps)

The query.

Returns:
ndarray of shape (n_queries, n_samples)

The lower bound of the distance between the i-th query and the j-th database sample.