************************** :py:mod:`wildboar.metrics` ************************** .. py:module:: wildboar.metrics .. autoapi-nested-parse:: Evaluation metrics. .. !! processed by numpydoc !! Package Contents ---------------- Functions --------- .. autoapisummary:: wildboar.metrics.compactness_score wildboar.metrics.plausability_score wildboar.metrics.proximity_score wildboar.metrics.redudancy_score wildboar.metrics.relative_proximity_score wildboar.metrics.silhouette_samples wildboar.metrics.silhouette_score wildboar.metrics.validity_score .. py:function:: compactness_score(x_factual, x_counterfactual, *, window=None, n_bins=None, atol=1e-08, average=True) Compute compactness score. The compactness of counterfactuals is measured by the fraction of changed timesteps. The fewer timesteps that change between the original and the counterfactual, the lower the score. :Parameters: **x_factual** : array-like of shape (n_samples, n_timesteps) or (n_samples, n_dims, n_timesteps) The true samples. **x_counterfactual** : array-like of shape (n_samples, n_timesteps) or (n_samples, n_dims, n_timesteps) The counterfactual samples. **window** : int, optional If set, evaluate the difference between windows of the specified size. **n_bins** : int, optional If set, evaluate the set overlap of SAX transformed series. **atol** : float, optional The absolute tolerance. **average** : bool, optional Compute the average score over all dimensions. :Returns: float The compactness score. A lower score indicates more compact counterfactuals. .. rubric:: Notes The samples in `x_counterfactual` and `x_factual` should be aligned such that the i-th counterfactual sample is derived from the i-th factual sample. .. rubric:: References Karlsson, I., Rebane, J., Papapetrou, P., & Gionis, A. (2020). Locally and globally explainable time series tweaking. Knowledge and Information Systems, 62(5), 1671-1700. .. only:: latex .. !! processed by numpydoc !! .. 
py:function:: plausability_score(x_plausible, x_counterfactuals, *, y_plausible=None, y_counterfactual=None, estimator=None, method='accuracy', average=True) Compute plausibility score. :Parameters: **x_plausible** : array-like of shape (n_samples, n_timesteps) The plausible samples, typically the training or testing samples. **x_counterfactuals** : array-like of shape (m_samples, n_timesteps) The counterfactual samples. **y_plausible** : array-like of shape (n_samples, ), optional The labels of the plausible samples. **y_counterfactual** : array-like of shape (m_samples, ), optional The desired label of the counterfactuals. **estimator** : estimator, optional The outlier estimator, must implement `fit` and `predict`. If None, we use LocalOutlierFactor. - if method='score', the estimator must also implement `decision_function`. **method** : {'score', 'accuracy'}, optional The score function. **average** : bool, optional If True, return the average score for all labels in y_counterfactual; otherwise, return the score for the individual labels (ordered as np.unique). :Returns: ndarray or float The plausibility. - if method='score', the mean score is returned, with a larger score indicating better performance. - if method='accuracy', the fraction of plausible counterfactuals is returned. - if y_counterfactual is not None and average=False, the score or accuracy for each counterfactual label is returned. .. rubric:: References Delaney, E., Greene, D., & Keane, M. T. (2020). Instance-based Counterfactual Explanations for Time Series Classification. arXiv, 2009.13211v2. .. only:: latex .. !! processed by numpydoc !! .. py:function:: proximity_score(x_factual, x_counterfactual, metric='normalized_euclidean', metric_params=None) Compute proximity score. The closer the counterfactual is to the original, the lower the score. :Parameters: **x_factual** : array-like of shape (n_samples, n_timesteps) The true samples. 
**x_counterfactual** : array-like of shape (n_samples, n_timesteps) The counterfactual samples. **metric** : str or callable, optional The distance metric. See ``_METRICS.keys()`` for a list of supported metrics. **metric_params** : dict, optional Parameters to the metric. Read more about the parameters in the :ref:`User guide `. :Returns: float The mean proximity. .. rubric:: Notes The samples in `x_counterfactual` and `x_factual` should be aligned such that the i-th counterfactual sample is derived from the i-th factual sample. .. rubric:: References Delaney, E., Greene, D., & Keane, M. T. (2020). Instance-based Counterfactual Explanations for Time Series Classification. arXiv, 2009.13211v2. Karlsson, I., Rebane, J., Papapetrou, P., & Gionis, A. (2020). Locally and globally explainable time series tweaking. Knowledge and Information Systems, 62(5), 1671-1700. .. only:: latex .. !! processed by numpydoc !! .. py:function:: redudancy_score(estimator, x_factual, x_counterfactual, y_counterfactual, *, n_intervals='sqrt', window=None, average=True) Compute the redundancy score. Redundancy is a measure of how much impact non-overlapping intervals have on the construction of the counterfactuals. :Parameters: **estimator** : Estimator The estimator for which the counterfactuals are computed. **x_factual** : array-like of shape (n_samples, n_timesteps) The factual samples, i.e., samples for which counterfactuals are computed. **x_counterfactual** : array-like of shape (n_samples, n_timesteps) The counterfactual samples. **y_counterfactual** : array-like of shape (n_samples, ) The desired counterfactual label. **n_intervals** : {"sqrt", "log2"}, int or float, optional The number of intervals. **window** : int, optional The size of an interval. If set, `n_intervals` is ignored. **average** : bool, optional Return the average redundancy over all intervals. 
:Returns: ndarray of shape (n_intervals, ) or float The redundancy of each interval, expressed as the fraction of samples that have the same label if the interval is replaced with the corresponding interval of the factual sample. If `average` is True, return a single float. .. rubric:: Notes The samples in `x_counterfactual` and `x_factual` should be aligned such that the i-th counterfactual sample is derived from the i-th factual sample. .. !! processed by numpydoc !! .. py:function:: relative_proximity_score(x_native, x_factual, x_counterfactual, *, y_native=None, y_counterfactual=None, metric='euclidean', metric_params=None, average=True) Compute relative proximity score. The relative proximity score captures the mean proximity of counterfactual and test sample pairs over the mean proximity of the closest native counterfactual. The lower the score, the better. :Parameters: **x_native** : array-like of shape (n_natives, n_timesteps) The native counterfactual candidates. If y_counterfactual is None, the full array is considered as possible native counterfactuals. Typically, native counterfactual candidates correspond to samples which are labeled as the desired counterfactual label. **x_factual** : array-like of shape (n_counterfactuals, n_timesteps) The factual samples, i.e., the samples for which the counterfactuals were computed. **x_counterfactual** : array-like of shape (n_counterfactuals, n_timesteps) The counterfactual samples. **y_native** : array-like of shape (n_natives, ), optional The label of the native counterfactual candidates. **y_counterfactual** : array-like of shape (n_counterfactuals, ), optional The desired counterfactual label. **metric** : str or callable, optional The distance metric. See ``_METRICS.keys()`` for a list of supported metrics. **metric_params** : dict, optional Parameters to the metric. Read more about the parameters in the :ref:`User guide `. 
**average** : bool, optional Average the relative proximity of all labels in y_counterfactual. :Returns: ndarray or float The relative proximity. If average=False and y_counterfactual is not None, return the relative proximity for each counterfactual label. .. rubric:: Notes The samples in `x_counterfactual` and `x_factual` should be aligned such that the i-th counterfactual sample is derived from the i-th factual sample. .. rubric:: References Smyth, B., & Keane, M. T. (2021). A Few Good Counterfactuals: Generating Interpretable, Plausible and Diverse Counterfactual Explanations. arXiv, 2101.09056v1. .. only:: latex .. !! processed by numpydoc !! .. py:function:: silhouette_samples(x, labels, *, metric='euclidean', metric_params=None) Compute the Silhouette Coefficient of each sample. :Parameters: **x** : univariate time-series or multivariate time-series The input time series. **labels** : array-like of shape (n_samples,) Predicted labels for each sample. **metric** : str or callable, optional The metric to use when calculating distance between time series. **metric_params** : dict, optional The metric parameters. Read more about the metrics and their parameters in the :ref:`User guide `. :Returns: ndarray of shape (n_samples, ) Silhouette Coefficient for each sample. .. rubric:: Notes This is a convenience wrapper around :ref:`sklearn.metrics.silhouette_samples` using Wildboar native metrics. .. !! processed by numpydoc !! .. py:function:: silhouette_score(x, labels, *, metric='euclidean', metric_params=None, sample_size=None, random_state=None) Compute the mean Silhouette Coefficient of all samples. :Parameters: **x** : univariate time-series or multivariate time-series The input time series. **labels** : array-like of shape (n_samples,) Predicted labels for each sample. **metric** : str or callable, optional The metric to use when calculating distance between time series. **metric_params** : dict, optional The metric parameters. 
Read more about the metrics and their parameters in the :ref:`User guide `. **sample_size** : int, optional The size of the sample to use when computing the Silhouette Coefficient on a random subset of the data. If ``sample_size is None``, no sampling is used. **random_state** : int or RandomState, optional Determines random number generation for selecting a subset of samples. Used when ``sample_size is not None``. :Returns: float Mean Silhouette Coefficient for all samples. .. rubric:: Notes This is a convenience wrapper around :ref:`sklearn.metrics.silhouette_score` using Wildboar native metrics. .. !! processed by numpydoc !! .. py:function:: validity_score(y_predicted, y_counterfactual, sample_weight=None) Compute validity score. The fraction of counterfactuals that have the desired label. :Parameters: **y_predicted** : array-like of shape (n_samples, ) The predicted label. **y_counterfactual** : array-like of shape (n_samples, ) The desired counterfactual label. **sample_weight** : array-like of shape (n_samples, ), optional The sample weight. :Returns: float The fraction of counterfactuals with the correct label. Larger is better. .. rubric:: References Delaney, E., Greene, D., & Keane, M. T. (2020). Instance-based Counterfactual Explanations for Time Series Classification. arXiv, 2009.13211v2. Karlsson, I., Rebane, J., Papapetrou, P., & Gionis, A. (2020). Locally and globally explainable time series tweaking. Knowledge and Information Systems, 62(5), 1671-1700. .. only:: latex .. !! processed by numpydoc !!
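The definitions of validity, compactness and proximity given above can be illustrated with a small NumPy sketch. This is not Wildboar's implementation: the ``sketch_*`` helpers are hypothetical names introduced here, the compactness sketch ignores the ``window`` and ``n_bins`` options, and the proximity sketch assumes a plain (unnormalized) Euclidean distance rather than the library default ``'normalized_euclidean'``. It only assumes that rows of the factual and counterfactual arrays are aligned as described in the notes.

```python
import numpy as np


def sketch_validity(y_predicted, y_counterfactual, sample_weight=None):
    """Weighted fraction of counterfactuals predicted as the desired label."""
    hits = (np.asarray(y_predicted) == np.asarray(y_counterfactual)).astype(float)
    return float(np.average(hits, weights=sample_weight))


def sketch_compactness(x_factual, x_counterfactual, atol=1e-8):
    """Fraction of timesteps whose value differs by more than atol."""
    x_factual = np.asarray(x_factual, dtype=float)
    x_counterfactual = np.asarray(x_counterfactual, dtype=float)
    # rtol=0 so that only the absolute tolerance decides equality.
    changed = ~np.isclose(x_factual, x_counterfactual, rtol=0.0, atol=atol)
    return float(changed.mean())


def sketch_proximity(x_factual, x_counterfactual):
    """Mean Euclidean distance between aligned 2D sample pairs (assumption)."""
    diff = np.asarray(x_factual, dtype=float) - np.asarray(x_counterfactual, dtype=float)
    return float(np.linalg.norm(diff, axis=-1).mean())


# Two aligned samples: the first counterfactual changes two timesteps,
# the second is identical to its factual sample.
x_factual = np.array([[0.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]])
x_counterfactual = np.array([[0.0, 3.0, 4.0, 0.0], [1.0, 1.0, 1.0, 1.0]])
y_predicted = np.array([1, 0])
y_counterfactual = np.array([1, 1])

print(sketch_compactness(x_factual, x_counterfactual))  # 0.25 (2 of 8 timesteps changed)
print(sketch_proximity(x_factual, x_counterfactual))  # 2.5 (distances 5.0 and 0.0)
print(sketch_validity(y_predicted, y_counterfactual))  # 0.5
print(sketch_validity(y_predicted, y_counterfactual, sample_weight=[3, 1]))  # 0.75
```

In this sketch a lower compactness and proximity indicate counterfactuals that stay closer to the original samples, while a larger validity is better, matching the conventions stated in the function descriptions.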