What’s new#
Feature: something that you couldn’t do before.
Efficiency: an existing feature now may not require as much computation or memory.
Enhancement: a miscellaneous minor improvement.
Fix: something that previously didn’t work as documented.
API: you will need to change your code to have the same effect in the future; or a feature will be removed in the future.
Dependencies#
Wildboar 1.2 requires Python 3.8+, numpy 1.19.5+, scipy 1.6.0+ and scikit-learn 1.3+.
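The minimum versions above can be checked before upgrading. The following is a minimal sketch that relies only on the standard library; the helper names (`as_tuple`, `meets`, `check_environment`) are hypothetical and not part of wildboar, and the version parsing is deliberately simplistic (it ignores pre-release suffixes).

```python
import sys
from importlib import metadata

# Minimum versions copied from the requirement line above.
MINIMUMS = {"numpy": "1.19.5", "scipy": "1.6.0", "scikit-learn": "1.3"}


def as_tuple(version):
    """Turn '1.19.5' into (1, 19, 5); stop at the first non-numeric part."""
    parts = []
    for piece in version.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
        if digits != piece:
            # Something like '0rc1': keep the leading number and stop.
            break
    return tuple(parts)


def meets(installed, minimum):
    """Compare versions numerically, padding the shorter one with zeros."""
    a, b = as_tuple(installed), as_tuple(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment():
    """Map each requirement to True, False, or None if it is not installed."""
    report = {"python": sys.version_info >= (3, 8)}
    for name, minimum in MINIMUMS.items():
        try:
            report[name] = meets(metadata.version(name), minimum)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(name, ok)
```

A value of `None` in the report means the package was not found in the current environment.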
Version 1.2.0#
New and changed models#
Wildboar 1.2 introduces several new models.
transform.HydraTransform: a new convolution based dictionary transformation method as described by Dempster et al. (2023).
linear_model.HydraClassifier: a new convolution based dictionary classifier as described by Dempster et al. (2023).
distance.KNeighborsClassifier: the traditional k-neighbors classifier using wildboar native distance metrics, including the full suite of optimized elastic metrics.
distance.KMeans: the traditional k-means clustering algorithm. Compared to scikit-learn, this implementation supports dtw and wdtw.
distance.KMedoids: the traditional k-medoids clustering algorithm with support for all elastic metrics.
ensemble.ElasticEnsembleClassifier: the elastic ensemble classifier as described by Lines and Bagnall (2015).
transform.DilatedShapeletTransform: a new shapelet based transform as described by Guillaume et al. (2022).
linear_model.DilatedShapeletClassifier: a new shapelet based classifier as described by Guillaume et al. (2022).
transform.CastorTransform: a new shapelet based transform using competing shapelets introduced in Samsten and Lee (2024).
linear_model.CastorClassifier: a new shapelet based classifier using competing shapelets introduced in Samsten and Lee (2024).
Changelog#
API Drop support for specifying a dataset version in load_dataset. Support was dropped to ensure consistency between the repository declaration of arrays and what is available in the downloaded bundle.
Fix Correctly detect duplicate repositories.
Fix Defer repository refresh to first use.
Enhancement Improve support for 3D arrays in distance.pairwise_distance and distance.paired_distance. Setting dim='mean' computes the mean distance over all dimensions, and setting dim='full' returns the distance for each dimension. The default value for dim will change to "mean" in 1.3. For 3D arrays, we issue a deprecation warning for the current default value.
Enhancement Add support for standardizing all subsequence metrics. Prefix the name of the metric with "scaled_" to use the standardized metric, e.g., "scaled_euclidean" or "scaled_msm".
Enhancement Add support for callable metrics. To support standardizing, we introduce a new keyword parameter, scale, that pairs with the metric parameter. If scale=True, all subsequences are scaled before the metric is applied. The effect of setting scale=True is the same as passing a scaled metric, e.g., "scaled_euclidean".
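To illustrate the difference between dim='mean' and dim='full' for 3D arrays, here is a minimal NumPy sketch using the Euclidean distance. It is not wildboar's implementation: the function name `pairwise_euclidean_3d` is hypothetical, the input layout (n_samples, n_dims, n_timestep) is an assumption, and the (n_dims, n_x, n_y) output layout for 'full' is assumed by analogy with the (n_dims, n_samples) shape the changelog states for distance.paired_distance.

```python
import numpy as np


def pairwise_euclidean_3d(x, y, dim="mean"):
    """Pairwise Euclidean distances for arrays shaped (n_samples, n_dims, n_timestep)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # diff has shape (n_x, n_y, n_dims, n_timestep).
    diff = x[:, None, :, :] - y[None, :, :, :]
    # One distance matrix per dimension: shape (n_dims, n_x, n_y).
    per_dim = np.moveaxis(np.sqrt((diff ** 2).sum(axis=-1)), -1, 0)
    if dim == "full":
        return per_dim
    if dim == "mean":
        # Average the per-dimension distances: shape (n_x, n_y).
        return per_dim.mean(axis=0)
    raise ValueError("dim must be 'mean' or 'full'")


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 2, 10))
y = rng.normal(size=(3, 2, 10))
print(pairwise_euclidean_3d(x, y, dim="full").shape)
print(pairwise_euclidean_3d(x, y, dim="mean").shape)
```

In this sketch, the 'mean' result is simply the average of the 'full' result over its first axis.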
Feature A new function distance.argmin_distance, which takes as input two arrays X and Y and finds, for each sample in X, the indices of the k samples in Y with the smallest distance to that sample.
Feature A new function distance.distance_profile, which takes a subsequence Y and a (collection of) time series X and returns the distance from Y to all subsequences of the same length of the i:th sample in X, with support for dilation and padding.
Feature A new function distance.argmin_subsequence_distance, which takes two paired arrays of subsequences and samples and finds the k smallest matching positions for each sample/subsequence pair.
Feature Enable support for n_jobs in distance.pairwise_distance and distance.paired_distance.
Feature Add support for Amercing Dynamic Time Warping (subsequence) distance.
Feature Add support for LCSS subsequence distance.
Feature Add support for EDR subsequence distance.
Feature Add support for TWE subsequence distance.
Feature Add support for MSM subsequence distance.
Feature Add support for ERP subsequence distance.
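The idea behind a distance profile and the k smallest matching positions can be sketched in plain NumPy. This is an illustration only, not wildboar's code: the function names are hypothetical, only the Euclidean metric is shown, dilation and padding are not modelled, and the `scale` flag mimics what a "scaled_" metric is described as doing (standardizing the subsequences before comparing them).

```python
import numpy as np


def standardize(a):
    """Z-normalize along the last axis; constant windows become all zeros."""
    a = np.asarray(a, dtype=float)
    mean = a.mean(axis=-1, keepdims=True)
    std = a.std(axis=-1, keepdims=True)
    return (a - mean) / np.where(std > 0, std, 1.0)


def euclidean_profile(y, x, scale=False):
    """Distance from subsequence y to every window of x with len(y) timesteps."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    m = len(y)
    # Row i of idx holds the positions i, i + 1, ..., i + m - 1.
    idx = np.arange(len(x) - m + 1)[:, None] + np.arange(m)
    windows = x[idx]
    if scale:
        # Standardize both the windows and the subsequence first.
        windows, y = standardize(windows), standardize(y)
    return np.sqrt(((windows - y) ** 2).sum(axis=-1))


def k_smallest_positions(profile, k=1):
    """Start indices of the k best matches, ordered by increasing distance."""
    return np.argsort(profile, kind="stable")[:k]


x = np.array([0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0])
y = np.array([2.0, 3.0, 2.0])
profile = euclidean_profile(y, x)
print(k_smallest_positions(profile, k=2))
```

With `scale=True`, a window that differs from the subsequence only by an offset and a positive factor gets a distance close to zero, which is the practical reason for standardizing.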
Fix Fix a bug in angular distance leading to NaN values.
Fix Fix a bug in angular distance subsequence matching where an incorrect threshold was set.
Fix Fix the return value of distance.paired_distance to (n_dims, n_samples) when dim="full".
API Rename the LCSS threshold parameter to epsilon. We will remove threshold in 1.4.
API Rename the EDR threshold parameter to epsilon. We will remove threshold in 1.4.
API Rename _distance.DistanceMeasure to Metric and _distance.SubsequenceDistanceMeasure to SubsequenceMetric. The change only affects code that cimports these modules.
API The default value of threshold in distance.subsequence_match has changed to "auto". The old value "best" has been deprecated and will be removed in 1.3.
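To make the role of the renamed LCSS parameter concrete, the sketch below computes a textbook LCSS distance in pure Python, where two points match if they differ by at most epsilon. It is an assumption-laden illustration rather than wildboar's implementation: the function `lcss_distance` is hypothetical, no warping window is applied, the normalization by the shorter length is the common textbook choice, and the handling of the old `threshold` name only shows how a deprecated alias can be kept working during a transition period.

```python
import warnings


def lcss_distance(x, y, epsilon=1.0, threshold=None):
    """Textbook LCSS distance; points match if they differ by at most epsilon."""
    if threshold is not None:
        # Hypothetical shim: accept the old name but warn about it.
        warnings.warn(
            "'threshold' is deprecated, use 'epsilon' instead",
            FutureWarning,
            stacklevel=2,
        )
        epsilon = threshold
    n, m = len(x), len(y)
    # table[i][j]: longest common subsequence length of x[:i] and y[:j].
    table = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if abs(x[i - 1] - y[j - 1]) <= epsilon:
                table[i][j] = table[i - 1][j - 1] + 1
            else:
                table[i][j] = max(table[i - 1][j], table[i][j - 1])
    # 0.0 when everything matches, 1.0 when nothing does.
    return 1.0 - table[n][m] / min(n, m)


print(lcss_distance([1, 2, 3, 4], [1.1, 2.2, 9, 4.05], epsilon=0.25))
```

A larger epsilon makes more points count as matches and therefore lowers the distance.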
Feature Add support for multiple metrics in ensemble.ShapeletForestClassifier and ensemble.ShapeletForestRegressor. All estimators with a metric parameter that implement the ShapeletMixin are affected by this change.
API Rename the constructor parameter base_estimator to estimator in ensemble.BaggingClassifier and ensemble.BaggingRegressor. base_estimator is deprecated in 1.2 and will be removed in 1.4.
API Replace the tuple argument kernel_size with two new parameters, min_size and max_size. This change affects ensemble.RocketForestClassifier and ensemble.RocketForestRegressor.
Fix Fix a bug where sampling was incorrectly set for ensemble.RocketForestClassifier and ensemble.RocketForestRegressor.
API Change the default value of n_shapelets to "log2" for ensemble.ShapeletForestClassifier and ensemble.ShapeletForestRegressor.
API Drop support for criterion="mse" in ensemble.ShapeletForestRegressor and ensemble.ExtraShapeletTreesRegressor.
Feature Add support for KNeighborsClassifiers fitted with any metric in explain.counterfactual.KNeighborsCounterfactual. We allow different methods for finding the counterfactuals for n_neighbors > 1 by setting method='mean' or method='medoid'. We have also improved the way in which cluster centroids are selected, resulting in more robust counterfactuals.
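The difference between a mean and a medoid summary of several neighbouring time series can be sketched as follows. This only illustrates the two concepts named by the method parameter; it does not reproduce how KNeighborsCounterfactual selects or uses them. The function name `summarize_neighbors` is hypothetical and the Euclidean distance is an assumption made for brevity.

```python
import numpy as np


def summarize_neighbors(neighbors, method="mean"):
    """Collapse an array shaped (n_neighbors, n_timestep) into one series."""
    neighbors = np.asarray(neighbors, dtype=float)
    if method == "mean":
        # Pointwise average; the result need not be an observed series.
        return neighbors.mean(axis=0)
    if method == "medoid":
        # The observed series with the smallest total distance to the others.
        diff = neighbors[:, None, :] - neighbors[None, :, :]
        total = np.sqrt((diff ** 2).sum(axis=-1)).sum(axis=1)
        return neighbors[np.argmin(total)]
    raise ValueError("method must be 'mean' or 'medoid'")


neighbors = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]])
print(summarize_neighbors(neighbors, method="mean"))
print(summarize_neighbors(neighbors, method="medoid"))
```

The mean is pulled towards the outlying series, whereas the medoid is always one of the observed series, which is one reason a medoid can be preferable under elastic metrics.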
API Undeprecate the normalize parameter of linear_model.RocketClassifier and linear_model.RocketRegressor.
Feature Add support for multiple metrics in transform.RandomShapeletTransform by passing a list of metric specifications. See the documentation for details.
Enhancement Rename the value log of the parameter n_intervals in transform.IntervalTransform to log2. The old value is deprecated and will be removed in 1.4.
Feature Improve the metric specification for transform.PivotTransform.
API Replace the tuple argument kernel_size with two new parameters, min_size and max_size. This change affects transform.RocketTransform.
Feature Add support for multiple metrics in tree.ShapeletTreeClassifier and tree.ShapeletTreeRegressor. All estimators with a metric parameter that implement the ShapeletMixin are affected by this change.
Fix Correctly use the MSM distance measure in tree.ProximityTreeClassifier.
Fix Correctly set min_samples_leaf in tree.RocketTreeClassifier and tree.RocketTreeRegressor.
API Replace the tuple argument kernel_size with two new parameters, min_size and max_size. This change affects tree.RocketTreeClassifier and tree.RocketTreeRegressor.
API The metric_factories parameter of tree.ProximityTreeClassifier has been renamed to metric. We have deprecated metric_factories and it will be removed in 1.4. We also introduce the metric_params argument for single-metric use.
API Change the default value of n_shapelets to "log2" for tree.ShapeletTreeClassifier and tree.ShapeletTreeRegressor.
API Drop support for criterion="mse" in tree.ShapeletTreeRegressor and tree.ExtraShapeletTreeRegressor.
Other improvements#
Remove all dependencies on deprecated NumPy APIs.
Migrate to the new scikit-learn parameter validation framework.