**************************
:py:mod:`wildboar.metrics`
**************************
.. py:module:: wildboar.metrics
.. autoapi-nested-parse::
Evaluation metrics.
Functions
---------
.. autoapisummary::
wildboar.metrics.compactness_score
wildboar.metrics.plausability_score
wildboar.metrics.proximity_score
wildboar.metrics.redudancy_score
wildboar.metrics.relative_proximity_score
wildboar.metrics.silhouette_samples
wildboar.metrics.silhouette_score
wildboar.metrics.validity_score
.. py:function:: compactness_score(x_factual, x_counterfactual, *, window=None, n_bins=None, atol=1e-08, average=True)
Compute compactness score.
The compactness of counterfactuals as measured by the fraction of changed
timesteps. The fewer timesteps have changed between the original and the
counterfactual, the lower the score.
:Parameters:
**x_factual** : array-like of shape (n_samples, n_timesteps) or (n_samples, n_dims, n_timesteps)
The true samples.
**x_counterfactual** : array-like of shape (n_samples, n_timesteps) or (n_samples, n_dims, n_timesteps)
The counterfactual samples.
**window** : int, optional
If set, evaluate the difference between windows of specified size.
**n_bins** : int, optional
If set, evaluate the set overlap of SAX transformed series.
**atol** : float, optional
The absolute tolerance.
**average** : bool, optional
Compute average score over all dimensions.
:Returns:
float
The compactness score. Lower score indicates more compact counterfactuals.
.. rubric:: Notes
The samples in `x_counterfactual` and `x_factual` should be aligned such
that the i:th counterfactual sample is derived from the i:th factual sample.
.. rubric:: References
Karlsson, I., Rebane, J., Papapetrou, P., & Gionis, A. (2020).
Locally and globally explainable time series tweaking.
Knowledge and Information Systems, 62(5), 1671-1700.
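.. rubric:: Examples
A minimal sketch of the idea behind the score, in plain Python: count the timesteps that differ by more than ``atol`` and divide by the total number of timesteps. The helper ``changed_fraction`` is hypothetical and is not wildboar's implementation; the ``window`` and ``n_bins`` variants are not reproduced here.

```python
from math import isclose


def changed_fraction(x_factual, x_counterfactual, atol=1e-8):
    """Fraction of timesteps that differ by more than `atol` (illustrative only)."""
    changed = 0
    total = 0
    for factual, counterfactual in zip(x_factual, x_counterfactual):
        for a, b in zip(factual, counterfactual):
            # A timestep counts as unchanged when |a - b| <= atol.
            if not isclose(a, b, rel_tol=0.0, abs_tol=atol):
                changed += 1
            total += 1
    return changed / total


# The i:th counterfactual is derived from the i:th factual sample.
x_factual = [[0.0, 1.0, 2.0, 3.0], [1.0, 1.0, 1.0, 1.0]]
x_counterfactual = [[0.0, 1.5, 2.0, 3.0], [1.0, 1.0, 0.0, 0.0]]

score = changed_fraction(x_factual, x_counterfactual)
print(score)  # 3 of 8 timesteps changed -> 0.375
```

A lower value means fewer timesteps were modified, which matches the "lower is more compact" reading above.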
.. py:function:: plausability_score(x_plausible, x_counterfactuals, *, y_plausible=None, y_counterfactual=None, estimator=None, method='accuracy', average=True)
Compute plausibility score.
:Parameters:
**x_plausible** : array-like of shape (n_samples, n_timesteps)
The plausible samples, typically the training or testing samples.
**x_counterfactuals** : array-like of shape (m_samples, n_timesteps)
The counterfactual samples.
**y_plausible** : array-like of shape (n_samples, ), optional
The labels of the plausible samples.
**y_counterfactual** : array-like of shape (m_samples, ), optional
The desired label of the counterfactuals.
**estimator** : estimator, optional
The outlier estimator, must implement `fit` and `predict`. If None,
we use LocalOutlierFactor.
- if method='score', the estimator must also implement `decision_function`.
**method** : {'score', 'accuracy'}, optional
The score function.
**average** : bool, optional
If True, return the average score for all labels in y_counterfactual;
otherwise, return the score for the individual labels (ordered as np.unique).
:Returns:
ndarray or float
The plausibility.
- if method='score', the mean score is returned, with a larger score indicating
better performance.
- if method='accuracy', the fraction of plausible counterfactuals is returned.
- if y_counterfactual is not None and average=False, the scores or accuracy for each
counterfactual label are returned.
.. rubric:: References
Delaney, E., Greene, D., & Keane, M. T. (2020).
Instance-based Counterfactual Explanations for Time Series Classification.
arXiv, 2009.13211v2.
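.. rubric:: Examples
A rough sketch of the ``method='accuracy'`` idea, assuming the outlier estimator follows the scikit-learn convention of predicting ``1`` for inliers and ``-1`` for outliers. The class ``NearestNeighborOutlierDetector`` and the helper ``plausible_fraction`` are hypothetical stand-ins written for this illustration; they replace LocalOutlierFactor with a much simpler radius rule and ignore the per-label handling.

```python
from math import dist


class NearestNeighborOutlierDetector:
    """Toy outlier estimator with `fit` and `predict` (not LocalOutlierFactor)."""

    def __init__(self, radius):
        self.radius = radius

    def fit(self, x):
        self.x_ = [list(row) for row in x]
        return self

    def predict(self, x):
        # 1 marks an inlier, -1 an outlier: a sample is an inlier when its
        # nearest fitted sample lies within `radius`.
        return [
            1 if min(dist(row, ref) for ref in self.x_) <= self.radius else -1
            for row in x
        ]


def plausible_fraction(x_plausible, x_counterfactuals, estimator):
    """Fraction of counterfactuals that the estimator considers inliers."""
    estimator.fit(x_plausible)
    predictions = estimator.predict(x_counterfactuals)
    return sum(1 for p in predictions if p == 1) / len(predictions)


x_plausible = [[0.0, 0.0, 0.0], [0.1, 0.0, 0.1], [1.0, 1.0, 1.0]]
x_counterfactuals = [
    [0.0, 0.1, 0.0],  # close to the first plausible sample
    [5.0, 5.0, 5.0],  # far from everything
    [0.9, 1.0, 1.1],  # close to the third plausible sample
    [0.0, 3.0, 0.0],  # far from everything
]

score = plausible_fraction(
    x_plausible, x_counterfactuals, NearestNeighborOutlierDetector(radius=0.5)
)
print(score)  # 2 of 4 counterfactuals are inliers -> 0.5
```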
.. py:function:: proximity_score(x_factual, x_counterfactual, metric='normalized_euclidean', metric_params=None)
Compute proximity score.
The closer the counterfactual is to the original, the lower the score.
:Parameters:
**x_factual** : array-like of shape (n_samples, n_timesteps)
The true samples.
**x_counterfactual** : array-like of shape (n_samples, n_timesteps)
The counterfactual samples.
**metric** : str or callable, optional
The distance metric.
See ``_METRICS.keys()`` for a list of supported metrics.
**metric_params** : dict, optional
Parameters to the metric.
Read more about the parameters in the
:ref:`User guide `.
:Returns:
float
The mean proximity.
.. rubric:: Notes
The samples in `x_counterfactual` and `x_factual` should be aligned such
that the i:th counterfactual sample is derived from the i:th factual sample.
.. rubric:: References
Delaney, E., Greene, D., & Keane, M. T. (2020).
Instance-based Counterfactual Explanations for Time Series Classification.
arXiv, 2009.13211v2.
Karlsson, I., Rebane, J., Papapetrou, P., & Gionis, A. (2020).
Locally and globally explainable time series tweaking.
Knowledge and Information Systems, 62(5), 1671-1700.
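.. rubric:: Examples
A minimal sketch of a mean proximity over aligned pairs, in plain Python. For simplicity it uses the plain Euclidean distance rather than the default ``'normalized_euclidean'`` metric, and the helper ``mean_pairwise_distance`` is hypothetical, not part of wildboar.

```python
from math import dist


def mean_pairwise_distance(x_factual, x_counterfactual):
    """Mean Euclidean distance between aligned factual/counterfactual pairs."""
    distances = [dist(f, c) for f, c in zip(x_factual, x_counterfactual)]
    return sum(distances) / len(distances)


# The i:th counterfactual is derived from the i:th factual sample.
x_factual = [[0.0, 0.0], [1.0, 1.0]]
x_counterfactual = [[3.0, 4.0], [1.0, 1.0]]

score = mean_pairwise_distance(x_factual, x_counterfactual)
print(score)  # distances are 5.0 and 0.0 -> 2.5
```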
.. py:function:: redudancy_score(estimator, x_factual, x_counterfactual, y_counterfactual, *, n_intervals='sqrt', window=None, average=True)
Compute the redundancy score.
Redundancy is a measure of how much impact non-overlapping intervals have
on the construction of the counterfactuals.
:Parameters:
**estimator** : Estimator
The estimator counterfactuals are computed for.
**x_factual** : array-like of shape (n_samples, n_timesteps)
The factual samples, i.e., samples for which counterfactuals
are computed.
**x_counterfactual** : array-like of shape (n_samples, n_timesteps)
The counterfactual samples.
**y_counterfactual** : array-like of shape (n_samples, )
The desired counterfactual label.
**n_intervals** : {"sqrt", "log2"}, int or float, optional
The number of intervals.
**window** : int, optional
The size of an interval. If set, `n_intervals` is ignored.
**average** : bool, optional
Return the average redundancy over all intervals.
:Returns:
ndarray of shape (n_intervals, ) or float
The redundancy of each interval, expressed as the fraction
of samples that have the same label if the interval is replaced
with the corresponding interval of the factual sample. If `average`
is True, return a single float.
.. rubric:: Notes
The samples in `x_counterfactual` and `x_factual` should be aligned such
that the i:th counterfactual sample is derived from the i:th factual sample.
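.. rubric:: Examples
A sketch of the procedure described under Returns, in plain Python: for each interval, the counterfactual values in that interval are replaced with the factual values, and the score is the fraction of samples that still receive the desired label. ``MeanThresholdClassifier`` and ``interval_redundancy`` are hypothetical names used only for this illustration; the sketch takes a fixed ``window`` and does not reproduce the ``n_intervals`` options.

```python
class MeanThresholdClassifier:
    """Hypothetical toy estimator: label 1 if the series mean exceeds the threshold."""

    def __init__(self, threshold):
        self.threshold = threshold

    def predict(self, x):
        return [1 if sum(row) / len(row) > self.threshold else 0 for row in x]


def interval_redundancy(estimator, x_factual, x_counterfactual, y_counterfactual, window):
    """Per-interval fraction of samples that keep the desired label (illustrative)."""
    n_timesteps = len(x_factual[0])
    scores = []
    for start in range(0, n_timesteps, window):
        end = min(start + window, n_timesteps)
        # Put the factual values back inside [start, end) and keep the
        # counterfactual values everywhere else.
        restored = [
            list(c[:start]) + list(f[start:end]) + list(c[end:])
            for f, c in zip(x_factual, x_counterfactual)
        ]
        predicted = estimator.predict(restored)
        unchanged = sum(1 for p, y in zip(predicted, y_counterfactual) if p == y)
        scores.append(unchanged / len(y_counterfactual))
    return scores


estimator = MeanThresholdClassifier(threshold=0.5)
x_factual = [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
x_counterfactual = [[0.0, 0.0, 2.0, 2.0], [2.0, 2.0, 2.0, 2.0]]
y_counterfactual = [1, 1]

per_interval = interval_redundancy(
    estimator, x_factual, x_counterfactual, y_counterfactual, window=2
)
average = sum(per_interval) / len(per_interval)
print(per_interval, average)  # [1.0, 0.5] 0.75
```

In this toy data the first interval can be restored in both samples without losing the desired label, so it is fully redundant, while restoring the second interval flips the label of the first sample.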
.. py:function:: relative_proximity_score(x_native, x_factual, x_counterfactual, *, y_native=None, y_counterfactual=None, metric='euclidean', metric_params=None, average=True)
Compute relative proximity score.
The relative proximity score captures the mean proximity of counterfactual
and test sample pairs over mean proximity of the closest native
counterfactual. The lower the score, the better.
:Parameters:
**x_native** : array-like of shape (n_natives, n_timesteps)
The native counterfactual candidates. If y_counterfactual is None, the full
array is considered as possible native counterfactuals. Typically, native
counterfactual candidates correspond to samples which are labeled as the
desired counterfactual label.
**x_factual** : array-like of shape (n_counterfactuals, n_timesteps)
The factual samples, i.e., the samples for which the counterfactuals
were computed.
**x_counterfactual** : array-like of shape (n_counterfactuals, n_timesteps)
The counterfactual samples.
**y_native** : array-like of shape (n_natives, ), optional
The label of the native counterfactual candidates.
**y_counterfactual** : array-like of shape (n_counterfactuals, ), optional
The desired counterfactual label.
**metric** : str or callable, optional
The distance metric.
See ``_METRICS.keys()`` for a list of supported metrics.
**metric_params** : dict, optional
Parameters to the metric.
Read more about the parameters in the
:ref:`User guide `.
**average** : bool, optional
Average the relative proximity of all labels in y_counterfactual.
:Returns:
ndarray or float
The relative proximity. If average=False and y_counterfactual is not None,
return the relative proximity for each counterfactual label.
.. rubric:: Notes
The samples in `x_counterfactual` and `x_factual` should be aligned such
that the i:th counterfactual sample is derived from the i:th factual sample.
.. rubric:: References
Smyth, B., & Keane, M. T. (2021).
A Few Good Counterfactuals: Generating Interpretable, Plausible and Diverse
Counterfactual Explanations. arXiv, 2101.09056v1.
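.. rubric:: Examples
One possible reading of the description, sketched in plain Python with the Euclidean distance: the mean distance between each factual sample and its counterfactual, divided by the mean distance between each factual sample and its closest native counterfactual candidate. The helper ``relative_proximity`` is hypothetical, and taking the means before dividing is an assumption about the computation, not a statement about wildboar's implementation. Per-label handling is omitted.

```python
from math import dist


def relative_proximity(x_native, x_factual, x_counterfactual):
    """Ratio of mean counterfactual distance to mean nearest-native distance."""
    counterfactual_dist = [dist(f, c) for f, c in zip(x_factual, x_counterfactual)]
    # Distance from each factual sample to its closest native candidate.
    native_dist = [min(dist(f, n) for n in x_native) for f in x_factual]
    mean_counterfactual = sum(counterfactual_dist) / len(counterfactual_dist)
    mean_native = sum(native_dist) / len(native_dist)
    return mean_counterfactual / mean_native


x_native = [[3.0, 4.0], [6.0, 8.0]]
x_factual = [[0.0, 0.0], [6.0, 0.0]]
x_counterfactual = [[0.0, 2.0], [6.0, 3.0]]

score = relative_proximity(x_native, x_factual, x_counterfactual)
print(score)  # mean 2.5 to counterfactuals, mean 5.0 to natives -> 0.5
```

A value below 1 means the counterfactuals are, on average, closer to the factual samples than the nearest native candidates are.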
.. py:function:: silhouette_samples(x, labels, *, metric='euclidean', metric_params=None)
Compute the Silhouette Coefficient of each sample.
:Parameters:
**x** : univariate time-series or multivariate time-series
The input time series.
**labels** : array-like of shape (n_samples,)
Predicted labels for each sample.
**metric** : str or callable, optional
The metric to use when calculating distance between time series.
**metric_params** : dict, optional
The metric parameters. Read more about the metrics and their parameters
in the :ref:`User guide `.
:Returns:
ndarray of shape (n_samples, )
Silhouette Coefficient for each sample.
.. rubric:: Notes
This is a convenient wrapper around :func:`sklearn.metrics.silhouette_samples`
using Wildboar native metrics.
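.. rubric:: Examples
For reference, the Silhouette Coefficient of a sample is ``(b - a) / max(a, b)``, where ``a`` is the mean distance to the other samples in its own cluster and ``b`` is the mean distance to the samples of the nearest other cluster. The sketch below computes it in plain Python with the Euclidean distance; ``silhouette_coefficients`` is a hypothetical helper for illustration and does not call wildboar or scikit-learn. It assumes at least two clusters and distinct samples.

```python
from math import dist


def silhouette_coefficients(x, labels):
    """Silhouette Coefficient per sample with Euclidean distance (illustrative)."""
    clusters = sorted(set(labels))
    scores = []
    for i, (xi, li) in enumerate(zip(x, labels)):
        # Mean distance from sample i to every cluster, excluding itself.
        mean_dist = {}
        for cluster in clusters:
            d = [
                dist(xi, xj)
                for j, (xj, lj) in enumerate(zip(x, labels))
                if lj == cluster and j != i
            ]
            mean_dist[cluster] = sum(d) / len(d) if d else None
        a = mean_dist[li]
        if a is None:
            # Singleton cluster: the coefficient is defined as 0.
            scores.append(0.0)
            continue
        b = min(v for c, v in mean_dist.items() if c != li and v is not None)
        scores.append((b - a) / max(a, b))
    return scores


# Two tight, well-separated clusters of short series.
x = [[0.0, 0.0], [0.0, 1.0], [10.0, 0.0], [10.0, 1.0]]
labels = [0, 0, 1, 1]

scores = silhouette_coefficients(x, labels)
mean_score = sum(scores) / len(scores)
print(scores, mean_score)  # every value is close to 1 for this data
```

Averaging the per-sample values gives the quantity that a mean Silhouette Coefficient reports.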
.. py:function:: silhouette_score(x, labels, *, metric='euclidean', metric_params=None, sample_size=None, random_state=None)
Compute the mean Silhouette Coefficient of all samples.
:Parameters:
**x** : univariate time-series or multivariate time-series
The input time series.
**labels** : array-like of shape (n_samples,)
Predicted labels for each sample.
**metric** : str or callable, optional
The metric to use when calculating distance between time series.
**metric_params** : dict, optional
The metric parameters. Read more about the metrics and their parameters
in the :ref:`User guide `.
**sample_size** : int, optional
The size of the sample to use when computing the Silhouette Coefficient
on a random subset of the data.
If ``sample_size is None``, no sampling is used.
**random_state** : int or RandomState, optional
Determines random number generation for selecting a subset of samples.
Used when ``sample_size is not None``.
:Returns:
float
Mean Silhouette Coefficient for all samples.
.. rubric:: Notes
This is a convenient wrapper around :func:`sklearn.metrics.silhouette_score`
using Wildboar native metrics.
.. py:function:: validity_score(y_predicted, y_counterfactual, sample_weight=None)
Compute validity score.
The fraction of counterfactuals that have the desired label.
:Parameters:
**y_predicted** : array-like of shape (n_samples, )
The predicted label.
**y_counterfactual** : array-like of shape (n_samples, )
The desired counterfactual label.
**sample_weight** : array-like of shape (n_samples, ), optional
The sample weight.
:Returns:
float
The fraction of counterfactuals with the correct label. Larger is better.
.. rubric:: References
Delaney, E., Greene, D., & Keane, M. T. (2020).
Instance-based Counterfactual Explanations for Time Series Classification.
arXiv, 2009.13211v2.
Karlsson, I., Rebane, J., Papapetrou, P., & Gionis, A. (2020).
Locally and globally explainable time series tweaking.
Knowledge and Information Systems, 62(5), 1671-1700.
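.. rubric:: Examples
A minimal sketch of the score as a (weighted) fraction of matching labels, in plain Python. The helper ``weighted_validity`` is hypothetical, and treating ``sample_weight`` as weights in a weighted average is an assumption made for this illustration.

```python
def weighted_validity(y_predicted, y_counterfactual, sample_weight=None):
    """Weighted fraction of samples whose predicted label is the desired label."""
    if sample_weight is None:
        # Without weights every sample counts equally.
        sample_weight = [1.0] * len(y_predicted)
    matched = sum(
        w
        for p, y, w in zip(y_predicted, y_counterfactual, sample_weight)
        if p == y
    )
    return matched / sum(sample_weight)


y_predicted = [1, 0, 1, 1]       # labels the estimator assigns to the counterfactuals
y_counterfactual = [1, 1, 1, 0]  # labels the counterfactuals were meant to get

unweighted = weighted_validity(y_predicted, y_counterfactual)
weighted = weighted_validity(
    y_predicted, y_counterfactual, sample_weight=[2.0, 1.0, 1.0, 0.0]
)
print(unweighted, weighted)  # 0.5 0.75
```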