What’s new

- Feature: something that you couldn’t do before.
- Efficiency: an existing feature may now require less computation or memory.
- Enhancement: a miscellaneous minor improvement.
- Fix: something that previously didn’t work as documented.
- API: you will need to change your code to have the same effect in the future, or a feature will be removed in the future.
Dependencies
Wildboar 1.2 requires Python 3.8+, numpy 1.17.3+, scipy 1.3.2+ and scikit-learn 1.3+.
Version 1.3.0

In development

New and changed models

Wildboar 1.3 introduces several new models.
- explain.counterfactual.NativeGuideCounterfactual: a baseline counterfactual explainer, as proposed by Delaney et al. (2021).
- Add a new module wildboar.dimension_selection for sampling a subset of the most important dimensions of multivariate time series. The new module contains six new selector algorithms:
  - DistanceVarianceThreshold: remove dimensions where the pairwise distances have a variance below the threshold.
  - SequentialDimensionSelector: remove dimensions sequentially by greedily adding (or removing) dimensions.
  - SelectDimensionPercentile: retain only the specified fraction of dimensions with the highest score.
  - SelectDimensionTopK: retain only the top k dimensions with the highest score.
  - SelectDimensionSignificance: retain only the dimensions with a p-value below the specified alpha level.
  - ECSSelector: select time series dimensions based on the sum of distances between pairs of classes.
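To make the variance-based selection concrete, here is a minimal NumPy sketch of the idea behind DistanceVarianceThreshold. The function name and details are hypothetical and not wildboar’s implementation: it simply drops dimensions whose pairwise distances barely vary.

```python
import numpy as np

def distance_variance_mask(X, threshold=0.0):
    """Keep dimensions whose pairwise-distance variance exceeds ``threshold``.

    Illustrative sketch only; X has shape (n_samples, n_dims, n_timesteps).
    """
    n_samples, n_dims, _ = X.shape
    mask = np.empty(n_dims, dtype=bool)
    for d in range(n_dims):
        # Pairwise Euclidean distances between all samples in dimension d.
        dists = [
            np.linalg.norm(X[i, d] - X[j, d])
            for i in range(n_samples)
            for j in range(i + 1, n_samples)
        ]
        mask[d] = np.var(dists) > threshold
    return mask

rng = np.random.default_rng(0)
X = rng.normal(size=(10, 3, 50))
X[:, 1] = 0.0  # a constant dimension: all pairwise distances are zero
mask = distance_variance_mask(X, threshold=1e-8)
print(mask)  # dimension 1 is dropped
```

A dimension in which all samples look alike carries no discriminative information, so removing it cheapens every downstream distance computation.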
- Add a new transform QuantTransform, as proposed by Dempster et al. (2024).
- Add a new transform FftTransform for the discrete Fourier transform.
- Add support for scikit-learn style preprocessing transformers: Standardize, MinMaxScale, MaxAbsScale, Truncate, and Interpolate.
- Add a new module lb for distance lower bounds. The module contains four lower bounds:
  - DtwKeoghLowerBound: lower bounds dynamic time warping.
  - DtwKimLowerBound: lower bounds dynamic time warping.
  - SaxLowerBound: lower bounds the z-normalized Euclidean distance.
  - PaaLowerBound: lower bounds the z-normalized Euclidean distance.
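As background on why lower bounds are useful, here is an illustrative pure-Python sketch of the classic LB_Keogh bound for DTW (wildboar’s DtwKeoghLowerBound is implemented differently; this only shows the principle):

```python
import math

def lb_keogh(query, candidate, window):
    """LB_Keogh-style lower bound for DTW (illustrative sketch only).

    Builds upper/lower envelopes around ``candidate`` using a band of
    half-width ``window`` and accumulates the squared distance of each
    point in ``query`` that falls outside the envelope.
    """
    n = len(candidate)
    total = 0.0
    for i, q in enumerate(query):
        lo = max(0, i - window)
        hi = min(n, i + window + 1)
        upper = max(candidate[lo:hi])
        lower = min(candidate[lo:hi])
        if q > upper:
            total += (q - upper) ** 2
        elif q < lower:
            total += (q - lower) ** 2
    return math.sqrt(total)

# A cheap bound: if it already exceeds the best distance found so far,
# the expensive DTW computation for this candidate can be skipped.
print(lb_keogh([1.0, 2.0, 3.0, 4.0], [1.0, 1.0, 1.0, 1.0], window=1))
```

Because the bound never exceeds the true DTW distance, pruning with it cannot discard the true nearest neighbor.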
- Feature: Native guide counterfactuals.
- Fix: Increase the timeout upon first use. We also issue a better error message to alert the user to use refresh_repositories to refresh the repositories.
- Fix: Avoid division by zero in standardize.
- Feature: Add support for interpolating missing values. Support is implemented both in load_dataset (with preprocess="interpolate") and as a scikit-learn compatible transformer, Interpolate.
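To illustrate what interpolating missing values means here, a minimal NumPy sketch (the function name is hypothetical, and the exact edge-case handling in wildboar may differ):

```python
import numpy as np

def interpolate_missing(x):
    """Linearly interpolate NaN values in a 1-d series (sketch)."""
    x = np.asarray(x, dtype=float).copy()
    missing = np.isnan(x)
    idx = np.arange(len(x))
    # Fill each NaN from the straight line through its observed neighbors.
    x[missing] = np.interp(idx[missing], idx[~missing], x[~missing])
    return x

print(interpolate_missing([1.0, np.nan, 3.0, np.nan, np.nan, 6.0]))
# → [1. 2. 3. 4. 5. 6.]
```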
- Feature: Add support for scikit-learn compatible transformers for standardize, minmax_scale, maxabs_scale and truncate.
- API: Rename matrix_profile to paired_matrix_profile and issue a deprecation warning in matrix_profile. The new function reverses the meaning of X and Y, i.e., it annotates every subsequence in X with the closest match in Y (instead of the reverse).
- Feature: A new function matrix_profile for computing the matrix profile of every subsequence in all time series. By default it raises a deprecation warning and delegates to paired_matrix_profile (until 1.4), after which kind="default" will be the default value. To keep the current behavior, set kind="paired" and swap the order of X and Y, or use paired_matrix_profile.
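As an illustration of what a (self-join) matrix profile contains, here is a naive quadratic-time sketch using plain Euclidean distance. This is only a conceptual aid: wildboar’s implementation uses z-normalized distances and a far faster algorithm.

```python
import numpy as np

def naive_matrix_profile(x, m):
    """For each length-m subsequence of x, the distance to its nearest
    non-trivial match in the same series (illustrative sketch only)."""
    n = len(x) - m + 1
    subs = np.array([x[i:i + m] for i in range(n)])
    profile = np.full(n, np.inf)
    for i in range(n):
        for j in range(n):
            if abs(i - j) >= m:  # exclusion zone around the trivial match
                d = np.linalg.norm(subs[i] - subs[j])
                if d < profile[i]:
                    profile[i] = d
    return profile

x = np.array([0.0, 1.0, 2.0, 0.0, 0.0, 0.0, 1.0, 2.0, 0.0, 0.0])
mp = naive_matrix_profile(x, m=3)
print(mp)  # the repeated motif [0, 1, 2] yields profile values of 0
```

Low profile values mark repeated motifs, which is why the "best" shapelet strategy mentioned below can use the matrix profile to locate informative subsequences.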
- Enhancement: Add parameter validation to all models in the wildboar.linear_model module.
- Feature: Add a new hyperparameter impurity_equality_tolerance, which controls when we treat impurities as equal. If the impurities of two shapelets are equal, we consider the separation gap instead. By default the distance separation gap is disabled (impurity_equality_tolerance=None), but it can be enabled by setting a (small) non-negative float.
- Feature: Add support for plotting decision trees using the plot_tree function.
- Feature: Add support for different strategies when constructing shapelet trees. When strategy="best", we use the matrix profile to find the best shapelets per sample, with sizes determined by the shapelet_size parameter. We can tune the trade-off between accuracy and computational cost by setting the sample_size parameter. The tree defaults to strategy="random" to retain backward compatibility. The default value will change to strategy="best" in 1.4, and we issue a deprecation warning.
- Feature: Add support for specifying a coverage probability instead of a shapelet size range, using coverage_probability and variability.
- Enhancement: Ensure that we consider the number of dimensions when computing the default value for n_shapelets.
- API: Deprecate the "sample" argument for intervals in interval-based transformations. To sub-sample intervals, set sample_size to a float.
- API: Deprecate RandomShapeletTransform, which will be removed in 1.4. Use ShapeletTransform with strategy="random" to keep the current behavior after 1.4.
- Feature: Add a new class ShapeletTransform that accepts an additional parameter strategy, which can be set to "random" or "best". If set to "best", we use the matrix profile to find the best shapelets per sample to use in the transformation. The shapelet size is determined by the shapelet_size parameter.
- Feature: Add a new option method to DerivativeTransform to select the way in which the difference is computed. Supported options include "backward", "central", and "slope".
- API: Deprecate the estimate parameter in SAX. The same behavior can be had using scale=True (the default).
- Enhancement: Add support for dyadic intervals in IntervalTransform.
- Enhancement: Add support for a quantile summarizer in IntervalTransform.
- Enhancement: Add support for summarizer parameters in IntervalTransform.
- Feature: Add support for specifying a coverage probability instead of a minimum and maximum interval size in IntervalTransform when using intervals="random".
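To make the dyadic-interval enhancement concrete, here is a small sketch of how dyadic intervals tile a series: each depth halves the intervals of the previous one. The function name is hypothetical and wildboar’s exact indexing may differ.

```python
def dyadic_intervals(n, max_depth):
    """Enumerate dyadic (start, end) intervals of a length-n series.

    At depth d the series is split into 2**d equal (integer-rounded)
    intervals; illustrative sketch only.
    """
    intervals = []
    for d in range(max_depth + 1):
        k = 2 ** d
        for j in range(k):
            intervals.append((j * n // k, (j + 1) * n // k))
    return intervals

print(dyadic_intervals(8, 2))
# → [(0, 8), (0, 4), (4, 8), (0, 2), (2, 4), (4, 6), (6, 8)]
```

Because the intervals are nested, a summarizer applied to them captures both coarse, series-wide behavior and fine, localized behavior.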