Time series#
A time series is a (temporally) ordered sequence of real values. A univariate
time series has a single dimension, whereas a multivariate time series has
multiple dimensions. In Wildboar, time series are represented by Numpy-arrays.
A single univariate time series is represented as an array of shape
(1, n_timestep)
(or (n_timestep, )
). Wildboar represents a
multivariate time series as an array of shape (1, n_dims, n_timestep)
(or (n_dims, n_timestep)
depending on context). A dataset of time
series is an array of n_samples
, i.e., for univariate time series an array
of shape (n_samples, n_timestep)
(or (n_samples, 1,
n_timestep)
) and a multivarete time series is represented as an array of shape
(n_samples, n_dims, n_timestep)
.
Most algorithms in wildboar assumes that the time series are of equal length
and without missing values. However, some datasets contain both missing values
and/or have time series or dimensions of unequal length. Wildboar represents
the End-of-Sequence identifier as EOS
and
value is missing by nupy.nan
. The EoS
value is a valid IEEE754
NaN
value, and will be treated as True
by numpy.isnan
,
whereas is_end_of_series
will treat
numpy.nan
as False
.
Note
By having EoS
treated as NaN
, we can ignore it and just treat them
as missing values.
import numpy as np
from wildboar.utils.variable_len import EOS
t1 = np.array([1, 2, 3, 1, 1, 1], dtype=float)
t2 = np.array([1, 2, 3, 1, EOS, EOS], dtype=float)
t3 = np.array([1, np.nan, 3, 3, 3, 3], dtype=float)
x = np.vstack([t1, t2, t3]) # dataset of (3, 6)
x
array([[ 1., 2., 3., 1., 1., 1.],
[ 1., 2., 3., 1., nan, nan],
[ 1., nan, 3., 3., 3., 3.]])
In the example, we construct a dataset comprising 3 samples, each containing 6 timesteps, where:
x[0]
contains no missing values and has a sequence length equal ton_timestep
.x[1]
also contains no missing values but has a sequence length of 4.x[2]
includes one missing value at the second index position.
Variable length time series#
Warning
Support for variable length time series is not stable and the API will change