`wildboar.transform._interval`#

Module Contents#

Classes#

`FeatureTransform`	Transform a time series into a number of features.
`IntervalTransform`	Embed a time series as a collection of features per interval.

class wildboar.transform._interval.FeatureTransform(*, summarizer='catch22', n_jobs=None)[source]#

Bases: IntervalTransform

Transform a time series into a number of features.

Parameters:

summarizer (str or list, optional) –
The method to summarize each interval.
- if str, the summarizer is determined by _SUMMARIZERS.keys().
- if list, the summarizer is a list of functions f(x) -> float, where x is a numpy array.
The default summarizer describes each time series using the catch22 features.
n_jobs (int, optional) – The number of cores to use on multi-core systems.
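
A list summarizer, as described above, is a list of callables that each map a NumPy array to a float. The sketch below builds such a list and applies it to one short series with plain NumPy. The helpers `slope` and `value_range` are hypothetical names introduced for illustration and are not part of wildboar; passing the list as `summarizer=` follows the signature above but is not exercised here.

```python
import numpy as np


def slope(x):
    """Least-squares slope of x against its index (hypothetical helper)."""
    t = np.arange(len(x))
    return float(np.polyfit(t, x, 1)[0])


def value_range(x):
    """Difference between the largest and smallest value (hypothetical helper)."""
    return float(np.max(x) - np.min(x))


# Each entry has the form f(x) -> float, where x is a numpy array.
summarizer = [np.mean, slope, value_range]

series = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
features = [float(f(series)) for f in summarizer]
print(features)
```

Each function contributes one feature, so the number of output features equals the length of the list.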

References

Lubba, Carl H., Sarab S. Sethi, Philip Knaute, Simon R. Schultz, Ben D. Fulcher, and Nick S. Jones. catch22: Canonical time-series characteristics. Data Mining and Knowledge Discovery 33, no. 6 (2019): 1821-1852.

class wildboar.transform._interval.IntervalTransform(n_intervals='sqrt', *, intervals='fixed', sample_size=0.5, min_size=0.0, max_size=1.0, summarizer='mean_var_slope', n_jobs=None, random_state=None)[source]#

Bases: wildboar.transform.base.BaseFeatureEngineerTransform

Embed a time series as a collection of features per interval.

Examples

>>> from wildboar.datasets import load_dataset
>>> from wildboar.transform._interval import IntervalTransform
>>> x, y = load_dataset("GunPoint")
>>> t = IntervalTransform(n_intervals=10, summarizer="mean")
>>> t.fit_transform(x)

Each interval (15 timepoints) is transformed to its mean.

>>> import numpy as np
>>> t = IntervalTransform(n_intervals="sqrt", summarizer=[np.mean, np.std])
>>> t.fit_transform(x)

Each interval (150 // 12 timepoints) is transformed to two features: the mean and the standard deviation.
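
What the two examples compute can be approximated with plain NumPy. This is a minimal sketch, not wildboar's implementation: it assumes that intervals are contiguous, near-equal chunks (here produced by `np.array_split`), that "sqrt" truncates to an integer, and that features are ordered interval by interval. The function `summarize_fixed` is a hypothetical helper, and random data of length 150 stands in for the GunPoint series.

```python
import numpy as np


def summarize_fixed(x, n_intervals, summarizers):
    """Summarize n_intervals contiguous chunks of every series (hypothetical helper).

    x has shape (n_samples, n_timestep); the result has shape
    (n_samples, n_intervals * len(summarizers)).
    """
    x = np.asarray(x, dtype=float)
    # Assumption: near-equal contiguous chunks along the time axis.
    chunks = np.array_split(x, n_intervals, axis=1)
    features = [
        np.apply_along_axis(f, 1, chunk)  # one value per sample
        for chunk in chunks
        for f in summarizers
    ]
    return np.column_stack(features)


rng = np.random.default_rng(0)
x = rng.normal(size=(5, 150))  # stand-in data with 150 timepoints

# Ten intervals of 15 timepoints, each reduced to its mean.
means = summarize_fixed(x, 10, [np.mean])

# "sqrt" intervals (assumed to truncate), each reduced to mean and std.
n_sqrt = int(np.sqrt(x.shape[1]))
mean_std = summarize_fixed(x, n_sqrt, [np.mean, np.std])

print(means.shape, mean_std.shape)
```

With one summarizer the output has one column per interval; with two summarizers it has two columns per interval.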

Parameters:

n_intervals (str, int or float, optional) –
The number of intervals to use for the transform.
- if “log”, the number of intervals is log2(n_timestep).
- if “sqrt”, the number of intervals is sqrt(n_timestep).
- if int, the number of intervals is n_intervals.
- if float, the number of intervals is n_intervals * n_timestep, with 0 < n_intervals < 1.
intervals (str, optional) –
The method for selecting intervals.
- if “fixed”, n_intervals non-overlapping intervals.
- if “sample”, n_intervals * sample_size non-overlapping intervals.
- if “random”, n_intervals possibly overlapping intervals, each with a size randomly sampled in [min_size * n_timestep, max_size * n_timestep].
sample_size (float, optional) – The fraction of the fixed intervals to sample if intervals="sample".
min_size (float, optional) – The minimum interval size, as a fraction of n_timestep, if intervals="random".
max_size (float, optional) – The maximum interval size, as a fraction of n_timestep, if intervals="random".
summarizer (str or list, optional) –
The method to summarize each interval.
- if str, the summarizer is determined by _SUMMARIZERS.keys().
- if list, the summarizer is a list of functions f(x) -> float, where x is a numpy array.
The default summarizer summarizes each interval as its mean, standard deviation and slope.
n_jobs (int, optional) – The number of cores to use on multi-core systems.
random_state (int or RandomState, optional) –
- If int, random_state is the seed used by the random number generator.
- If RandomState instance, random_state is the random number generator.
- If None, the random number generator is the RandomState instance used by np.random.
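
The rules for `n_intervals` and `intervals` described above can be sketched as follows. This is an illustration under stated assumptions, not wildboar's code: the rounding (truncation with a minimum of one), the way fixed boundaries are placed, and the use of `np.random.default_rng` instead of a `RandomState` are all assumptions, and `resolve_n_intervals` and `select_intervals` are hypothetical helper names.

```python
import math

import numpy as np


def resolve_n_intervals(n_intervals, n_timestep):
    """Turn the n_intervals parameter into an integer count (assumed rounding)."""
    if n_intervals == "log":
        return max(1, int(math.log2(n_timestep)))
    if n_intervals == "sqrt":
        return max(1, int(math.sqrt(n_timestep)))
    if isinstance(n_intervals, int):
        return n_intervals
    if isinstance(n_intervals, float):
        if not 0.0 < n_intervals < 1.0:
            raise ValueError("a float n_intervals must satisfy 0 < n_intervals < 1")
        return max(1, int(n_intervals * n_timestep))
    raise ValueError(f"unsupported n_intervals: {n_intervals!r}")


def select_intervals(n_timestep, n_intervals, intervals="fixed",
                     sample_size=0.5, min_size=0.0, max_size=1.0, seed=None):
    """Return a list of (start, end) pairs, with end exclusive."""
    rng = np.random.default_rng(seed)
    if intervals in ("fixed", "sample"):
        # Non-overlapping intervals that together cover the whole series.
        edges = np.linspace(0, n_timestep, n_intervals + 1).astype(int)
        spans = [(int(a), int(b)) for a, b in zip(edges[:-1], edges[1:])]
        if intervals == "sample":
            # Keep n_intervals * sample_size of the fixed intervals.
            k = max(1, int(n_intervals * sample_size))
            keep = np.sort(rng.choice(n_intervals, size=k, replace=False))
            spans = [spans[i] for i in keep]
        return spans
    if intervals == "random":
        # Possibly overlapping intervals with randomly sampled sizes.
        lo = max(1, int(min_size * n_timestep))
        hi = max(lo, int(max_size * n_timestep))
        spans = []
        for _ in range(n_intervals):
            length = int(rng.integers(lo, hi + 1))
            start = int(rng.integers(0, n_timestep - length + 1))
            spans.append((start, start + length))
        return spans
    raise ValueError(f"unsupported intervals: {intervals!r}")


n = resolve_n_intervals("sqrt", 150)
fixed = select_intervals(150, n)
sampled = select_intervals(150, 10, intervals="sample", seed=0)
random_spans = select_intervals(150, 10, intervals="random",
                                min_size=0.1, max_size=0.3, seed=0)
```

Only the "random" strategy can return overlapping intervals; "fixed" and "sample" both draw from the same non-overlapping partition.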
