Metrics

Wildboar supports both subsequence distances, via pairwise_subsequence_distance, and traditional whole-series distances, via pairwise_distance. These and related functions accept different metrics, selected with the metric argument and configured with the metric_params argument.

Distance metrics are functions \(d(x, y)\) such that \(d(x, y) < d(x, z)\) when time series \(x\) and \(y\) are “more similar” than \(x\) and \(z\). In Wildboar we use the term loosely to denote any function obeying the above inequality. A true metric must also satisfy the following:

  1. The distance from a point to itself is always zero, i.e., \(d(x, x) = 0\).

  2. The distance between two distinct points is always positive, i.e., \(d(x, y) \gt 0\), provided that \(x\ne y\).

  3. The distance is symmetrical, i.e., \(d(x, y) = d(y, x)\), which is to say that the distance between \(x\) and \(y\) is the same as the distance between \(y\) and \(x\).

  4. The triangle inequality holds, \(d(x, z) \le d(x, y) + d(y, z)\), i.e., there can be no “shortcut” through \(y\) that makes the distance shorter.
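
The four properties above can be spot-checked numerically on a finite set of points. The sketch below is an illustration only: `is_true_metric` is a hypothetical helper, not part of Wildboar, and passing the check on a sample does not prove that a function is a metric in general. It does show that the squared Euclidean distance, which still orders pairs by similarity, fails the triangle inequality.

```python
import itertools
import math


def euclidean(x, y):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def is_true_metric(d, points, tol=1e-12):
    """Check the four metric properties of `d` on a finite set of points."""
    for x in points:
        if abs(d(x, x)) > tol:  # 1. distance to itself is zero
            return False
    for x, y in itertools.permutations(points, 2):
        if x != y and d(x, y) <= 0:  # 2. distinct points are a positive distance apart
            return False
        if abs(d(x, y) - d(y, x)) > tol:  # 3. symmetry
            return False
    for x, y, z in itertools.permutations(points, 3):
        if d(x, z) > d(x, y) + d(y, z) + tol:  # 4. triangle inequality
            return False
    return True


series = [(0.0, 0.0, 1.0), (1.0, 2.0, 3.0), (2.0, 0.0, -1.0), (0.5, 0.5, 0.5)]
euclidean_ok = is_true_metric(euclidean, series)

# Squared Euclidean distance on the points 0, 1, 2: d(0, 2) = 4 > d(0, 1) + d(1, 2) = 2.
squared_ok = is_true_metric(lambda x, y: euclidean(x, y) ** 2, [(0.0,), (1.0,), (2.0,)])

print(euclidean_ok, squared_ok)
```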

Similarity measures fall into three categories.

  1. Non-elastic true metrics, such as the \(L_p\)-norm, that do not tolerate time shifts.

  2. Elastic metrics that tolerate time shifts but are not true metrics.

  3. Elastic metrics that tolerate time shifts and are also true metrics.
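
To make the difference between the first two categories concrete, the following is a minimal textbook sketch of dynamic time warping with a warping band. It is not Wildboar's implementation: how the window fraction `r` is converted to a band width, and the final square root, are assumptions made for this illustration.

```python
import math


def dtw(x, y, r=1.0):
    """Dynamic time warping with a band whose width is a fraction r of the longer series."""
    n, m = len(x), len(y)
    # Assumption: band half-width in samples, widened so the last cell stays reachable.
    w = max(math.floor(r * max(n, m)), abs(n - m))
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(max(1, i - w), min(m, i + w) + 1):
            step = (x[i - 1] - y[j - 1]) ** 2
            # Extend the cheapest of the three admissible predecessor alignments.
            cost[i][j] = step + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return math.sqrt(cost[n][m])


x = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0]
y = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]  # the same bump, shifted one step earlier

shifted_full_window = dtw(x, y, r=1.0)  # warping absorbs the shift
shifted_no_window = dtw(x, y, r=0.0)    # no warping allowed: reduces to Euclidean

short = [0.0, 1.0, 2.0, 1.0, 0.0]
long = [0.0, 0.0, 1.0, 2.0, 2.0, 1.0, 0.0]
unequal_length = dtw(short, long)
```

With the full window the distance between `x` and `y` is zero even though the two series differ, so this function violates the positivity requirement and is not a true metric. With `r=0.0` it falls back to the Euclidean distance, which penalizes the shift.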

Wildboar distinguishes between subsequence and non-subsequence metrics:

  1. Subsequence metrics compute \(\min_{t' \in t} d(s, t')\) with s.shape[-1] <= t.shape[-1] (i.e., \(s\) is no longer than \(t\)), where the notation means “take every subsequence \(t'\) of \(t\) with the same length as \(s\), compute its distance to \(s\), and return the minimum”. Subsequence metrics are never true metrics unless s.shape[-1] == t.shape[-1].

  2. Metrics compute the distance \(d(s, t)\) with s.shape[-1] == t.shape[-1] unless the metric is _elastic_. Elastic metrics support distance computations between time series of unequal length, without computing the minimum distance between equal-length subsequences.
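
The subsequence definition in the list above can be sketched as a sliding window in plain Python. `subsequence_distance` is a hypothetical helper written for this illustration and is not how Wildboar computes the value internally.

```python
import math


def euclidean(s, t):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(s, t)))


def subsequence_distance(s, t, d=euclidean):
    """Minimum of d(s, t') over every window t' of t with len(t') == len(s)."""
    if len(s) > len(t):
        raise ValueError("s must not be longer than t")
    k = len(s)
    return min(d(s, t[i:i + k]) for i in range(len(t) - k + 1))


s = [1.0, 2.0, 1.0]
t = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]
u = [0.0, 1.0, 0.0]

best_match = subsequence_distance(s, t)    # s occurs exactly in t at offset 2
equal_length = subsequence_distance(s, u)  # only one window, so equals d(s, u)
```

When both inputs have the same length there is only one window, so the subsequence distance equals the ordinary distance. Swapping the arguments so that the first is longer raises an error in this sketch, which is one way to see why the subsequence form cannot be symmetric for unequal lengths.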

Subsequence metrics

Wildboar implements several subsequence metrics. The elastic subsequence metrics are, like all subsequence metrics, implemented as the minimum distance over a sliding window. If we need the elastic distance between two whole series rather than the minimum over windows, we should use the non-subsequence metric. Moreover, subsequence metrics are only true metrics if we compute the distance between time series of equal length, in which case the minimum subsequence distance equals the distance itself.

| Metric name | metric | metric_params | Comments |
|---|---|---|---|
| Euclidean | "euclidean" | {} | |
| Normalized Euclidean | "normalized_euclidean" | {} | Euclidean distance, where length has been scaled to have unit norm. Undefined cases result in 0. |
| Scaled Euclidean | "scaled_euclidean" or "mass" | {} | Scales each subsequence to have zero mean and unit variance. |
| Manhattan | "manhattan" | {} | |
| Minkowski | "minkowski" | {"p": float} | |
| Chebyshev | "chebyshev" | {} | |
| Cosine | "cosine" | {} | |
| Angular | "angular" | {} | |
| Dynamic time warping | "dtw" | {"r": float} | Window r in [0, 1]. |
| Weighted DTW | "wdtw" | {"r": float, "g": float} | Window r in [0, 1], default 1.0. Phase difference penalty g, default 0.05. |
| Derivative DTW | "ddtw" | {"r": float} | Window r in [0, 1], default 1.0. |
| Weighted Derivative DTW | "wddtw" | {"r": float, "g": float} | Window r in [0, 1], default 1.0. Phase difference penalty g, default 0.05. |
| Scaled DTW | "scaled_dtw" | {"r": float} | Window r in [0, 1]. |
| Longest common subsequence [2] | "lcss" | {"r": float, "epsilon": float} | Window r in [0, 1], default 1.0. Match epsilon, default 1.0. |
| Edit distance with real penalty [4] | "erp" | {"r": float, "g": float} | Window r in [0, 1], default 1.0. Gap penalty g, default 0. |
| Edit distance for real sequences [3] | "edr" | {"r": float, "epsilon": float} | Window r in [0, 1], default 1.0. Match epsilon, default 1/4*max(std(x), std(y)). |
| Move-split-merge [5] | "msm" | {"r": float, "c": float} | Window r in [0, 1], default 1.0. Split/merge cost c, default 1. |
| Time Warp Edit distance [6] | "twe" | {"r": float, "edit_penalty": float, "stiffness": float} | Window r in [0, 1]. Edit penalty (\(\lambda\)), default 1. Stiffness (\(\nu\)), default 0.001. |

Elastic and non-elastic metrics

| Metric name | metric | metric_params | Comments |
|---|---|---|---|
| Euclidean | "euclidean" | {} | |
| Normalized Euclidean | "normalized_euclidean" | {} | Euclidean distance, where length has been scaled to have unit norm. Undefined cases result in 0. |
| Manhattan | "manhattan" | {} | |
| Minkowski | "minkowski" | {"p": float} | |
| Chebyshev | "chebyshev" | {} | |
| Cosine | "cosine" | {} | |
| Angular | "angular" | {} | |
| Longest common subsequence [2] | "lcss" | {"r": float, "epsilon": float} | Window r in [0, 1], default 1. Match epsilon, default 1. |
| Edit distance with real penalty [4] | "erp" | {"r": float, "g": float} | Window r in [0, 1]. Gap penalty g, default 0. |
| Edit distance for real sequences [3] | "edr" | {"r": float, "epsilon": float} | Window r in [0, 1]. Match epsilon, default 1/4*max(std(x), std(y)). |
| Move-split-merge [5] | "msm" | {"r": float, "c": float} | Window r in [0, 1]. Split/merge cost c, default 1. |
| Time Warp Edit distance [6] | "twe" | {"r": float, "edit_penalty": float, "stiffness": float} | Window r in [0, 1]. Edit penalty (\(\lambda\)), default 1. Stiffness (\(\nu\)), default 0.001. |
| Amercing dynamic time warping [1] | "adtw" | {"r": float, "p": float} | Window r in [0, 1]. Penalty p; larger values penalize warping more. |
| Dynamic time warping | "dtw" | {"r": float} | Window r in [0, 1]. |
| Weighted DTW | "wdtw" | {"r": float, "g": float} | Window r in [0, 1]. Phase difference penalty g, default 0.05. |
| Derivative DTW | "ddtw" | {"r": float} | Window r in [0, 1]. |
| Weighted Derivative DTW | "wddtw" | {"r": float, "g": float} | Window r in [0, 1]. Phase difference penalty g, default 0.05. |
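
The p parameter of the Minkowski metric ties several of the non-elastic entries together. The sketch below uses the standard textbook definition, assumed here rather than taken from Wildboar's source, to show that p = 1 gives the Manhattan distance, p = 2 the Euclidean distance, and large p approaches the Chebyshev distance.

```python
def minkowski(x, y, p=2.0):
    """Minkowski distance of order p between two equal-length sequences."""
    return sum(abs(a - b) ** p for a, b in zip(x, y)) ** (1.0 / p)


def chebyshev(x, y):
    """Largest absolute coordinate difference."""
    return max(abs(a - b) for a, b in zip(x, y))


x = [0.0, 3.0, 1.0]
y = [4.0, 0.0, 1.0]

manhattan_like = minkowski(x, y, p=1.0)   # |0-4| + |3-0| + |1-1| = 7
euclidean_like = minkowski(x, y, p=2.0)   # sqrt(16 + 9 + 0) = 5
large_p = minkowski(x, y, p=64.0)         # close to the Chebyshev distance of 4
```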