Symbolic aggregate approximation

In this example, we explore the SAX transformation, which discretizes a time series into bins and assigns each bin a letter.
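To make the idea concrete before turning to wildboar, here is a minimal pure-Python sketch of the usual SAX recipe: standardize the series, average it over fixed-size windows, and map each average to a letter using breakpoints that split a standard normal distribution into equally likely bins. The function name `sax_sketch` and the toy series are made up for illustration. This is not wildboar's implementation, and it does not guard against a constant series (zero standard deviation).

```python
from statistics import NormalDist, mean, pstdev


def sax_sketch(series, window, n_bins):
    """Illustrative SAX: standardize, average per window, map to letters."""
    mu, sigma = mean(series), pstdev(series)
    # Standardize to zero mean and unit standard deviation.
    z = [(v - mu) / sigma for v in series]
    # Piecewise aggregate approximation: the mean of each window.
    paa = [mean(z[i:i + window]) for i in range(0, len(z), window)]
    # Breakpoints that split a standard normal into equiprobable bins.
    cuts = [NormalDist().inv_cdf(k / n_bins) for k in range(1, n_bins)]
    # The bin index of a value is the number of breakpoints it exceeds.
    letters = "abcdefghijklmnopqrstuvwxyz"
    return "".join(letters[sum(p > c for c in cuts)] for p in paa)


# Hypothetical toy series, not taken from the dataset used below.
word = sax_sketch([0.0, 0.1, 0.9, 1.0, 0.5, 0.4, 0.1, 0.0], window=2, n_bins=4)
print(word)  # prints "adca"
```

Each letter summarizes one window of two values, so the eight-point series is reduced to a four-letter word.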

[1]:
import numpy as np
import matplotlib.pyplot as plt

from wildboar.datasets import load_dataset
from wildboar.transform import SAX
from wildboar.utils.plot import plot_time_domain

First, we load a dataset. Note that we preprocess the dataset with the minmax_scale function, so each time series is scaled to the range [0, 1] rather than standardized to zero mean and unit standard deviation.

[2]:
x, y = load_dataset("TwoLeadECG", preprocess="minmax_scale")
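The difference between the two preprocessing schemes matters for the rest of the example, so here is a small sketch that contrasts them. The helper names `minmax_scale_sketch` and `standardize_sketch` and the toy values are hypothetical, and the sketch assumes the scaling is applied to each time series independently. It does not call wildboar.

```python
from statistics import mean, pstdev


def minmax_scale_sketch(series):
    """Rescale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(series), max(series)
    return [(v - lo) / (hi - lo) for v in series]


def standardize_sketch(series):
    """Shift and scale values to zero mean and unit standard deviation."""
    mu, sigma = mean(series), pstdev(series)
    return [(v - mu) / sigma for v in series]


# Hypothetical toy series.
raw = [2.0, 4.0, 6.0, 8.0]
scaled = minmax_scale_sketch(raw)
standardized = standardize_sketch(raw)
print(scaled)
print(standardized)
```

The min/max-scaled values never leave [0, 1], while the standardized values are centered on zero and take both signs.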

Next, we define a SAX transformer. Since SAX is stateless, we do not need to call fit on the estimator. By default, SAX sets the estimate parameter to True, so the binning parameters are estimated separately for each sample.

[3]:
transform = SAX(window=4, n_bins=4)
[4]:
x_sax = transform.fit_transform(x)
[5]:
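fig, ax = plt.subplots(nrows=2, figsize=(8, 6))
# Repeat each symbol 4 times (the window size) to align it with the original time axis.
ax[0].plot(np.repeat(x_sax[0], 4, axis=0))
# Plot the original (min/max scaled) time series for comparison.
ax[1].plot(x[0])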
[Figure: SAX representation of the first time series (top) and the original time series (bottom).]

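If we set estimate=False, SAX no longer estimates the binning parameters from each sample, and the results are incorrect unless the time series are standardized. Here they are min/max scaled instead.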

[6]:
transform.set_params(estimate=False)
x_sax = transform.fit_transform(x)
[7]:
fig, ax = plt.subplots(nrows=2, figsize=(8, 6))
ax[0].plot(np.repeat(x_sax[0], 4, axis=0))
ax[1].plot(x[0])
[Figure: SAX representation with estimate=False (top) and the original time series (bottom).]
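One way to see why the result above goes wrong is to apply normal-distribution breakpoints directly to values that lie in [0, 1]. The sketch below assumes that, without estimation, the breakpoints are the quartiles of a standard normal distribution. The helper `bin_index` and the sample values are hypothetical, and this is an illustration of the effect rather than wildboar's code.

```python
from statistics import NormalDist

n_bins = 4
# Breakpoints that split a standard normal distribution into equiprobable bins.
normal_cuts = [NormalDist().inv_cdf(k / n_bins) for k in range(1, n_bins)]


def bin_index(value, cuts):
    """Return the number of breakpoints that value exceeds."""
    return sum(value > c for c in cuts)


# Hypothetical min/max-scaled values covering the full [0, 1] range.
values = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
used_bins = sorted({bin_index(v, normal_cuts) for v in values})
print([round(c, 2) for c in normal_cuts])
print(used_bins)
```

Because the lowest breakpoint is negative and the data never drops below zero, the lowest bin can never be reached, and the remaining bins are used unevenly.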

However, since the time series are min/max normalized, we can safely use binning="uniform" without setting estimate=True.

[8]:
transform.set_params(binning="uniform")
x_sax = transform.fit_transform(x)
[9]:
fig, ax = plt.subplots(nrows=2, figsize=(8, 6))
ax[0].plot(np.repeat(x_sax[0], 4, axis=0))
ax[1].plot(x[0])
[Figure: SAX representation with binning="uniform" and estimate=False (top) and the original time series (bottom).]
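The reason uniform binning works here can be sketched with equal-width breakpoints on [0, 1]. The sketch assumes that uniform binning without estimation places the breakpoints at equal distances between 0 and 1. The helper `bin_index` and the sample values are again hypothetical and independent of wildboar.

```python
n_bins = 4
# Equal-width breakpoints on [0, 1]; assumes the data is already min/max scaled.
uniform_cuts = [k / n_bins for k in range(1, n_bins)]


def bin_index(value, cuts):
    """Return the number of breakpoints that value exceeds."""
    return sum(value > c for c in cuts)


# Hypothetical min/max-scaled values covering the full [0, 1] range.
values = [0.0, 0.2, 0.4, 0.6, 0.8, 1.0]
bins = [bin_index(v, uniform_cuts) for v in values]
print(uniform_cuts)
print(bins)
```

With these breakpoints every bin is reachable, because the fixed range of the breakpoints matches the range of the scaled data.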