Classification Metrics

F1

Bases: BaseMetric

F1 score metric for classification tasks.

This class calculates the F1 score, which is the harmonic mean of precision and recall. It's particularly useful when you need a balance between precision and recall.

Key Features
  • Calculates F1 score for binary and multiclass classification
  • Supports different averaging methods (micro, macro, weighted, etc.)
  • Integrates with framework3's BaseMetric interface
Usage

The F1 metric can be used to evaluate classification models:

from framework3.plugins.metrics.classification import F1
from framework3.base.base_types import XYData
import numpy as np

# Create sample data
y_true = XYData(value=np.array([0, 1, 2, 0, 1, 2]))
y_pred = XYData(value=np.array([0, 2, 1, 0, 0, 1]))
x_data = XYData(value=np.array([1, 2, 3, 4, 5, 6]))

# Create and use the F1 metric
f1_metric = F1(average='macro')
score = f1_metric.evaluate(x_data, y_true, y_pred)
print(f"F1 Score: {score}")

Attributes:

    average (str): The type of averaging performed on the data. Default is 'weighted'.

Methods:

    evaluate(x_data: XYData, y_true: XYData | None, y_pred: XYData, **kwargs) -> Float | np.ndarray:
        Calculate the F1 score for the given predictions and true values.

Note

This metric uses scikit-learn's f1_score function internally. Ensure that scikit-learn is properly installed and compatible with your environment.

Source code in framework3/plugins/metrics/classification.py
@Container.bind()
class F1(BaseMetric):
    """
    F1 score metric for classification tasks.

    This class calculates the F1 score, which is the harmonic mean of precision and recall.
    It's particularly useful when you need a balance between precision and recall.

    Key Features:
        - Calculates F1 score for binary and multiclass classification
        - Supports different averaging methods (micro, macro, weighted, etc.)
        - Integrates with framework3's BaseMetric interface

    Usage:
        The F1 metric can be used to evaluate classification models:

        ```python
        from framework3.plugins.metrics.classification import F1
        from framework3.base.base_types import XYData
        import numpy as np

        # Create sample data
        y_true = XYData(value=np.array([0, 1, 2, 0, 1, 2]))
        y_pred = XYData(value=np.array([0, 2, 1, 0, 0, 1]))
        x_data = XYData(value=np.array([1, 2, 3, 4, 5, 6]))

        # Create and use the F1 metric
        f1_metric = F1(average='macro')
        score = f1_metric.evaluate(x_data, y_true, y_pred)
        print(f"F1 Score: {score}")
        ```

    Attributes:
        average (str): The type of averaging performed on the data. Default is 'weighted'.

    Methods:
        evaluate(x_data: XYData, y_true: XYData | None, y_pred: XYData, **kwargs) -> Float | np.ndarray:
            Calculate the F1 score for the given predictions and true values.

    Note:
        This metric uses scikit-learn's f1_score function internally. Ensure that scikit-learn
        is properly installed and compatible with your environment.
    """

    def __init__(
        self,
        average: Literal[
            "micro", "macro", "samples", "weighted", "binary"
        ] = "weighted",
    ):
        """
        Initialize a new F1 metric instance.

        This constructor sets up the F1 metric with the specified averaging method.

        Args:
            average (Literal['micro', 'macro', 'samples', 'weighted', 'binary']): The type of averaging performed on the data. Default is 'weighted'.
                           Other options include 'micro', 'macro', 'samples', 'binary', or None.

        Note:
            The 'average' parameter is passed directly to scikit-learn's f1_score function.
            Refer to scikit-learn's documentation for detailed information on averaging methods.
        """
        super().__init__(average=average)
        self.average = average

    def evaluate(
        self,
        x_data: XYData,
        y_true: XYData | None,
        y_pred: XYData,
        **kwargs: Unpack[PrecissionKwargs],
    ) -> Float | np.ndarray:
        """
        Calculate the F1 score for the given predictions and true values.

        This method computes the F1 score, which is the harmonic mean of precision and recall.

        Args:
            x_data (XYData): The input data (not used in this metric, but required by the interface).
            y_true (XYData | None): The ground truth (correct) target values.
            y_pred (XYData): The estimated targets as returned by a classifier.
            **kwargs (Unpack[PrecissionKwargs]): Additional keyword arguments passed to sklearn's f1_score function.

        Returns:
            Float | np.ndarray: The F1 score or array of F1 scores if average is None.

        Raises:
            ValueError: If y_true is None.

        Note:
            This method uses scikit-learn's f1_score function internally with zero_division=0.
        """
        if y_true is None:
            raise ValueError("Ground truth (y_true) must be provided.")

        kwargs.setdefault(
            "average",
            cast(
                Literal["micro", "macro", "samples", "weighted", "binary"], self.average
            ),
        )
        kwargs.setdefault("zero_division", 0)

        return f1_score(
            y_true.value,
            y_pred.value,
            **kwargs,
        )  # type: ignore

__init__(average='weighted')

Initialize a new F1 metric instance.

This constructor sets up the F1 metric with the specified averaging method.

Parameters:

    average (Literal['micro', 'macro', 'samples', 'weighted', 'binary']): The type of averaging performed on the data. Other options include 'micro', 'macro', 'samples', 'binary', or None. Default: 'weighted'.
Note

The 'average' parameter is passed directly to scikit-learn's f1_score function. Refer to scikit-learn's documentation for detailed information on averaging methods.

Source code in framework3/plugins/metrics/classification.py
def __init__(
    self,
    average: Literal[
        "micro", "macro", "samples", "weighted", "binary"
    ] = "weighted",
):
    """
    Initialize a new F1 metric instance.

    This constructor sets up the F1 metric with the specified averaging method.

    Args:
        average (Literal['micro', 'macro', 'samples', 'weighted', 'binary']): The type of averaging performed on the data. Default is 'weighted'.
                       Other options include 'micro', 'macro', 'samples', 'binary', or None.

    Note:
        The 'average' parameter is passed directly to scikit-learn's f1_score function.
        Refer to scikit-learn's documentation for detailed information on averaging methods.
    """
    super().__init__(average=average)
    self.average = average

evaluate(x_data, y_true, y_pred, **kwargs)

Calculate the F1 score for the given predictions and true values.

This method computes the F1 score, which is the harmonic mean of precision and recall.

Parameters:

    x_data (XYData): The input data (not used in this metric, but required by the interface). Required.
    y_true (XYData | None): The ground truth (correct) target values. Required.
    y_pred (XYData): The estimated targets as returned by a classifier. Required.
    **kwargs (Unpack[PrecissionKwargs]): Additional keyword arguments passed to sklearn's f1_score function. Default: {}.

Returns:

    Float | np.ndarray: The F1 score, or an array of F1 scores if average is None.

Raises:

    ValueError: If y_true is None.

Note

This method uses scikit-learn's f1_score function internally with zero_division=0.
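
To make the zero_division=0 behaviour concrete: when a class never appears in the predictions, its precision is 0/0, and zero_division=0 maps the affected per-class scores to 0 instead of triggering scikit-learn's ill-defined-metric warning. A minimal sketch using f1_score directly (which this method wraps):

from sklearn.metrics import f1_score
import numpy as np

y_true = np.array([0, 0, 1])
y_pred = np.array([0, 0, 0])   # class 1 is never predicted

# Class 0: precision 2/3, recall 1.0, F1 = 0.8; class 1: precision is undefined, so the score is forced to 0
print(f1_score(y_true, y_pred, average=None, zero_division=0))    # [0.8, 0.0]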

Source code in framework3/plugins/metrics/classification.py
def evaluate(
    self,
    x_data: XYData,
    y_true: XYData | None,
    y_pred: XYData,
    **kwargs: Unpack[PrecissionKwargs],
) -> Float | np.ndarray:
    """
    Calculate the F1 score for the given predictions and true values.

    This method computes the F1 score, which is the harmonic mean of precision and recall.

    Args:
        x_data (XYData): The input data (not used in this metric, but required by the interface).
        y_true (XYData | None): The ground truth (correct) target values.
        y_pred (XYData): The estimated targets as returned by a classifier.
        **kwargs (Unpack[PrecissionKwargs]): Additional keyword arguments passed to sklearn's f1_score function.

    Returns:
        Float | np.ndarray: The F1 score or array of F1 scores if average is None.

    Raises:
        ValueError: If y_true is None.

    Note:
        This method uses scikit-learn's f1_score function internally with zero_division=0.
    """
    if y_true is None:
        raise ValueError("Ground truth (y_true) must be provided.")

    kwargs.setdefault(
        "average",
        cast(
            Literal["micro", "macro", "samples", "weighted", "binary"], self.average
        ),
    )
    kwargs.setdefault("zero_division", 0)

    return f1_score(
        y_true.value,
        y_pred.value,
        **kwargs,
    )  # type: ignore

Precission

Bases: BaseMetric

Precision metric for classification tasks.

This class calculates the precision score, which is the ratio tp / (tp + fp) where tp is the number of true positives and fp the number of false positives.

Key Features
  • Calculates precision score for binary and multiclass classification
  • Supports different averaging methods (micro, macro, weighted, etc.)
  • Integrates with framework3's BaseMetric interface
Usage

The Precission metric can be used to evaluate classification models:

from framework3.plugins.metrics.classification import Precission
from framework3.base.base_types import XYData
import numpy as np

# Create sample data
y_true = XYData(value=np.array([0, 1, 2, 0, 1, 2]))
y_pred = XYData(value=np.array([0, 2, 1, 0, 0, 1]))
x_data = XYData(value=np.array([1, 2, 3, 4, 5, 6]))

# Create and use the Precission metric
precision_metric = Precission(average='macro')
score = precision_metric.evaluate(x_data, y_true, y_pred)
print(f"Precision Score: {score}")

Attributes:

    average (Literal['micro', 'macro', 'samples', 'weighted', 'binary'] | None): The type of averaging performed on the data. Default is 'weighted'.

Methods:

    evaluate(x_data: XYData, y_true: XYData | None, y_pred: XYData, **kwargs) -> Float | np.ndarray:
        Calculate the precision score for the given predictions and true values.

Note

This metric uses scikit-learn's precision_score function internally. Ensure that scikit-learn is properly installed and compatible with your environment.

Source code in framework3/plugins/metrics/classification.py
@Container.bind()
class Precission(BaseMetric):
    """
    Precision metric for classification tasks.

    This class calculates the precision score, which is the ratio tp / (tp + fp) where tp is
    the number of true positives and fp the number of false positives.

    Key Features:
        - Calculates precision score for binary and multiclass classification
        - Supports different averaging methods (micro, macro, weighted, etc.)
        - Integrates with framework3's BaseMetric interface

    Usage:
        The Precission metric can be used to evaluate classification models:

        ```python
        from framework3.plugins.metrics.classification import Precission
        from framework3.base.base_types import XYData
        import numpy as np

        # Create sample data
        y_true = XYData(value=np.array([0, 1, 2, 0, 1, 2]))
        y_pred = XYData(value=np.array([0, 2, 1, 0, 0, 1]))
        x_data = XYData(value=np.array([1, 2, 3, 4, 5, 6]))

        # Create and use the Precission metric
        precision_metric = Precission(average='macro')
        score = precision_metric.evaluate(x_data, y_true, y_pred)
        print(f"Precision Score: {score}")
        ```

    Attributes:
        average (Literal["micro", "macro", "samples", "weighted", "binary"]|None): The type of averaging performed on the data. Default is 'weighted'.

    Methods:
        evaluate (x_data: XYData, y_true: XYData | None, y_pred: XYData, **kwargs) -> Float | np.ndarray:
            Calculate the precision score for the given predictions and true values.

    Note:
        This metric uses scikit-learn's precision_score function internally. Ensure that scikit-learn
        is properly installed and compatible with your environment.
    """

    def __init__(
        self,
        average: Literal["micro", "macro", "samples", "weighted", "binary"]
        | None = "weighted",
    ):
        """
        Initialize a new Precission metric instance.

        This constructor sets up the Precission metric with the specified averaging method.

        Args:
            average (Literal["micro", "macro", "samples", "weighted", "binary"]|None): The type of averaging performed on the data. Default is 'weighted'.
                                  Options are 'micro', 'macro', 'samples', 'weighted', 'binary', or None.

        Note:
            The 'average' parameter is passed directly to scikit-learn's precision_score function.
            Refer to scikit-learn's documentation for detailed information on averaging methods.
        """
        super().__init__(average=average)

    def evaluate(
        self,
        x_data: XYData,
        y_true: XYData | None,
        y_pred: XYData,
        **kwargs: Unpack[PrecissionKwargs],
    ) -> Float | np.ndarray:
        """
        Calculate the precision score for the given predictions and true values.

        This method computes the precision score, which is the ratio of true positives to the
        sum of true and false positives.

        Args:
            x_data (XYData): The input data (not used in this metric, but required by the interface).
            y_true (XYData | None): The ground truth (correct) target values.
            y_pred (XYData): The estimated targets as returned by a classifier.
            **kwargs (Unpack[PrecissionKwargs]): Additional keyword arguments passed to sklearn's precision_score function.

        Returns:
            Float | np.ndarray: The precision score or array of precision scores if average is None.

        Raises:
            ValueError: If y_true is None.

        Note:
            This method uses scikit-learn's precision_score function internally with zero_division=0.
        """
        if y_true is None:
            raise ValueError("Ground truth (y_true) must be provided.")
        return precision_score(
            y_true.value,
            y_pred.value,
            zero_division=0,
            average=self.average,
            **kwargs,  # type: ignore
        )  # type: ignore

__init__(average='weighted')

Initialize a new Precission metric instance.

This constructor sets up the Precission metric with the specified averaging method.

Parameters:

    average (Literal['micro', 'macro', 'samples', 'weighted', 'binary'] | None): The type of averaging performed on the data. Options are 'micro', 'macro', 'samples', 'weighted', 'binary', or None. Default: 'weighted'.
Note

The 'average' parameter is passed directly to scikit-learn's precision_score function. Refer to scikit-learn's documentation for detailed information on averaging methods.

Source code in framework3/plugins/metrics/classification.py
def __init__(
    self,
    average: Literal["micro", "macro", "samples", "weighted", "binary"]
    | None = "weighted",
):
    """
    Initialize a new Precission metric instance.

    This constructor sets up the Precission metric with the specified averaging method.

    Args:
        average (Literal["micro", "macro", "samples", "weighted", "binary"]|None): The type of averaging performed on the data. Default is 'weighted'.
                              Options are 'micro', 'macro', 'samples', 'weighted', 'binary', or None.

    Note:
        The 'average' parameter is passed directly to scikit-learn's precision_score function.
        Refer to scikit-learn's documentation for detailed information on averaging methods.
    """
    super().__init__(average=average)

evaluate(x_data, y_true, y_pred, **kwargs)

Calculate the precision score for the given predictions and true values.

This method computes the precision score, which is the ratio of true positives to the sum of true and false positives.

Parameters:

    x_data (XYData): The input data (not used in this metric, but required by the interface). Required.
    y_true (XYData | None): The ground truth (correct) target values. Required.
    y_pred (XYData): The estimated targets as returned by a classifier. Required.
    **kwargs (Unpack[PrecissionKwargs]): Additional keyword arguments passed to sklearn's precision_score function. Default: {}.

Returns:

    Float | np.ndarray: The precision score, or an array of precision scores if average is None.

Raises:

    ValueError: If y_true is None.

Note

This method uses scikit-learn's precision_score function internally with zero_division=0.

Source code in framework3/plugins/metrics/classification.py
def evaluate(
    self,
    x_data: XYData,
    y_true: XYData | None,
    y_pred: XYData,
    **kwargs: Unpack[PrecissionKwargs],
) -> Float | np.ndarray:
    """
    Calculate the precision score for the given predictions and true values.

    This method computes the precision score, which is the ratio of true positives to the
    sum of true and false positives.

    Args:
        x_data (XYData): The input data (not used in this metric, but required by the interface).
        y_true (XYData | None): The ground truth (correct) target values.
        y_pred (XYData): The estimated targets as returned by a classifier.
        **kwargs (Unpack[PrecissionKwargs]): Additional keyword arguments passed to sklearn's precision_score function.

    Returns:
        Float | np.ndarray: The precision score or array of precision scores if average is None.

    Raises:
        ValueError: If y_true is None.

    Note:
        This method uses scikit-learn's precision_score function internally with zero_division=0.
    """
    if y_true is None:
        raise ValueError("Ground truth (y_true) must be provided.")
    return precision_score(
        y_true.value,
        y_pred.value,
        zero_division=0,
        average=self.average,
        **kwargs,  # type: ignore
    )  # type: ignore

Recall

Bases: BaseMetric

Recall metric for classification tasks.

This class calculates the recall score, which is the ratio tp / (tp + fn) where tp is the number of true positives and fn the number of false negatives.

Key Features
  • Calculates recall score for binary and multiclass classification
  • Supports different averaging methods (micro, macro, weighted, etc.)
  • Integrates with framework3's BaseMetric interface
Usage

The Recall metric can be used to evaluate classification models:

from framework3.plugins.metrics.classification import Recall
from framework3.base.base_types import XYData
import numpy as np

# Create sample data
y_true = XYData(value=np.array([0, 1, 2, 0, 1, 2]))
y_pred = XYData(value=np.array([0, 2, 1, 0, 0, 1]))
x_data = XYData(value=np.array([1, 2, 3, 4, 5, 6]))

# Create and use the Recall metric
recall_metric = Recall(average='macro')
score = recall_metric.evaluate(x_data, y_true, y_pred)
print(f"Recall Score: {score}")

Attributes:

    average (str | None): The type of averaging performed on the data. Default is 'weighted'.

Methods:

    evaluate(x_data: XYData, y_true: XYData | None, y_pred: XYData, **kwargs) -> Float | np.ndarray:
        Calculate the recall score for the given predictions and true values.

Note

This metric uses scikit-learn's recall_score function internally. Ensure that scikit-learn is properly installed and compatible with your environment.

Source code in framework3/plugins/metrics/classification.py
@Container.bind()
class Recall(BaseMetric):
    """
    Recall metric for classification tasks.

    This class calculates the recall score, which is the ratio tp / (tp + fn) where tp is
    the number of true positives and fn the number of false negatives.

    Key Features:
        - Calculates recall score for binary and multiclass classification
        - Supports different averaging methods (micro, macro, weighted, etc.)
        - Integrates with framework3's BaseMetric interface

    Usage:
        The Recall metric can be used to evaluate classification models:

        ```python
        from framework3.plugins.metrics.classification import Recall
        from framework3.base.base_types import XYData
        import numpy as np

        # Create sample data
        y_true = XYData(value=np.array([0, 1, 2, 0, 1, 2]))
        y_pred = XYData(value=np.array([0, 2, 1, 0, 0, 1]))
        x_data = XYData(value=np.array([1, 2, 3, 4, 5, 6]))

        # Create and use the Recall metric
        recall_metric = Recall(average='macro')
        score = recall_metric.evaluate(x_data, y_true, y_pred)
        print(f"Recall Score: {score}")
        ```

    Attributes:
        average (str | None): The type of averaging performed on the data. Default is 'weighted'.

    Methods:
        evaluate(x_data: XYData, y_true: XYData | None, y_pred: XYData, **kwargs) -> Float | np.ndarray:
            Calculate the recall score for the given predictions and true values.

    Note:
        This metric uses scikit-learn's recall_score function internally. Ensure that scikit-learn
        is properly installed and compatible with your environment.
    """

    def __init__(
        self,
        average: Literal["micro", "macro", "samples", "weighted", "binary"]
        | None = "weighted",
    ):
        """
        Initialize a new Recall metric instance.

        This constructor sets up the Recall metric with the specified averaging method.

        Args:
            average (str | None): The type of averaging performed on the data. Default is 'weighted'.
                                  Options are 'micro', 'macro', 'samples', 'weighted', 'binary', or None.

        Note:
            The 'average' parameter is passed directly to scikit-learn's recall_score function.
            Refer to scikit-learn's documentation for detailed information on averaging methods.
        """
        super().__init__(average=average)

    def evaluate(
        self,
        x_data: XYData,
        y_true: XYData | None,
        y_pred: XYData,
        **kwargs: Unpack[PrecissionKwargs],
    ) -> Float | np.ndarray:
        """
        Calculate the recall score for the given predictions and true values.

        This method computes the recall score, which is the ratio of true positives to the
        sum of true positives and false negatives.

        Args:
            x_data (XYData): The input data (not used in this metric, but required by the interface).
            y_true (XYData | None): The ground truth (correct) target values.
            y_pred (XYData): The estimated targets as returned by a classifier.
            **kwargs (Unpack[PrecissionKwargs]): Additional keyword arguments passed to sklearn's recall_score function.

        Returns:
            Float | np.ndarray: The recall score or array of recall scores if average is None.

        Raises:
            ValueError: If y_true is None.

        Note:
            This method uses scikit-learn's recall_score function internally with zero_division=0.
        """
        if y_true is None:
            raise ValueError("Ground truth (y_true) must be provided.")
        return recall_score(
            y_true.value,
            y_pred.value,
            zero_division=0,
            average=self.average,
            **kwargs,  # type: ignore
        )  # type: ignore

__init__(average='weighted')

Initialize a new Recall metric instance.

This constructor sets up the Recall metric with the specified averaging method.

Parameters:

    average (str | None): The type of averaging performed on the data. Options are 'micro', 'macro', 'samples', 'weighted', 'binary', or None. Default: 'weighted'.
Note

The 'average' parameter is passed directly to scikit-learn's recall_score function. Refer to scikit-learn's documentation for detailed information on averaging methods.

Source code in framework3/plugins/metrics/classification.py
def __init__(
    self,
    average: Literal["micro", "macro", "samples", "weighted", "binary"]
    | None = "weighted",
):
    """
    Initialize a new Recall metric instance.

    This constructor sets up the Recall metric with the specified averaging method.

    Args:
        average (str | None): The type of averaging performed on the data. Default is 'weighted'.
                              Options are 'micro', 'macro', 'samples', 'weighted', 'binary', or None.

    Note:
        The 'average' parameter is passed directly to scikit-learn's recall_score function.
        Refer to scikit-learn's documentation for detailed information on averaging methods.
    """
    super().__init__(average=average)

evaluate(x_data, y_true, y_pred, **kwargs)

Calculate the recall score for the given predictions and true values.

This method computes the recall score, which is the ratio of true positives to the sum of true positives and false negatives.

Parameters:

    x_data (XYData): The input data (not used in this metric, but required by the interface). Required.
    y_true (XYData | None): The ground truth (correct) target values. Required.
    y_pred (XYData): The estimated targets as returned by a classifier. Required.
    **kwargs (Unpack[PrecissionKwargs]): Additional keyword arguments passed to sklearn's recall_score function. Default: {}.

Returns:

    Float | np.ndarray: The recall score, or an array of recall scores if average is None.

Raises:

    ValueError: If y_true is None.

Note

This method uses scikit-learn's recall_score function internally with zero_division=0.

Source code in framework3/plugins/metrics/classification.py
def evaluate(
    self,
    x_data: XYData,
    y_true: XYData | None,
    y_pred: XYData,
    **kwargs: Unpack[PrecissionKwargs],
) -> Float | np.ndarray:
    """
    Calculate the recall score for the given predictions and true values.

    This method computes the recall score, which is the ratio of true positives to the
    sum of true positives and false negatives.

    Args:
        x_data (XYData): The input data (not used in this metric, but required by the interface).
        y_true (XYData | None): The ground truth (correct) target values.
        y_pred (XYData): The estimated targets as returned by a classifier.
        **kwargs (Unpack[PrecissionKwargs]): Additional keyword arguments passed to sklearn's recall_score function.

    Returns:
        Float | np.ndarray: The recall score or array of recall scores if average is None.

    Raises:
        ValueError: If y_true is None.

    Note:
        This method uses scikit-learn's recall_score function internally with zero_division=0.
    """
    if y_true is None:
        raise ValueError("Ground truth (y_true) must be provided.")
    return recall_score(
        y_true.value,
        y_pred.value,
        zero_division=0,
        average=self.average,
        **kwargs,  # type: ignore
    )  # type: ignore

Overview

The Classification Metrics module in framework3 provides a set of evaluation metrics specifically designed for assessing the performance of classification models. These metrics help in understanding various aspects of a classifier's performance, such as accuracy, precision, recall, and F1-score.

Available Classification Metrics

Accuracy Score

The Accuracy Score is implemented in the AccuracyScoreMetric. It computes the accuracy of a classification model by comparing the predicted labels with the true labels.

Usage

from framework3.plugins.metrics.classification.accuracy_score import AccuracyScoreMetric

accuracy_metric = AccuracyScoreMetric()
score = accuracy_metric.compute(y_true, y_pred)

Precision Score

The precision score is implemented by the Precission class documented above. It computes the precision of a classification model, which is the ratio of true positive predictions to the total number of positive predictions.

Usage

from framework3.plugins.metrics.classification import Precission

precision_metric = Precission(average='weighted')
# x_data is required by the BaseMetric interface but is not used by this metric
score = precision_metric.evaluate(x_data, y_true, y_pred)

Parameters

  • average (str): The averaging method. Options include 'micro', 'macro', 'weighted', 'samples', and None.
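
The choice of averaging method matters most when classes are imbalanced. A minimal sketch contrasting 'macro' and 'weighted' with scikit-learn's precision_score directly (which the Precission class wraps), on a hypothetical imbalanced toy problem:

from sklearn.metrics import precision_score
import numpy as np

# Class 0 dominates: 8 samples of class 0, 2 of class 1
y_true = np.array([0, 0, 0, 0, 0, 0, 0, 0, 1, 1])
y_pred = np.array([0, 0, 0, 0, 0, 0, 0, 1, 1, 0])

# Per-class precision: class 0 = 7/8 = 0.875, class 1 = 1/2 = 0.5
# 'macro' averages them equally: (0.875 + 0.5) / 2 = 0.6875
print(precision_score(y_true, y_pred, average='macro', zero_division=0))

# 'weighted' weights by class support: 0.8 * 0.875 + 0.2 * 0.5 = 0.8
print(precision_score(y_true, y_pred, average='weighted', zero_division=0))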

Recall Score

The recall score is implemented by the Recall class documented above. It computes the recall of a classification model, which is the ratio of true positive predictions to the total number of actual positive instances.

Usage

from framework3.plugins.metrics.classification import Recall

recall_metric = Recall(average='weighted')
# x_data is required by the BaseMetric interface but is not used by this metric
score = recall_metric.evaluate(x_data, y_true, y_pred)

Parameters

  • average (str): The averaging method. Options include 'micro', 'macro', 'weighted', 'samples', and None.

F1 Score

The F1 score is implemented by the F1 class documented above. It computes the F1 score, which is the harmonic mean of precision and recall.

Usage

from framework3.plugins.metrics.classification import F1

f1_metric = F1(average='weighted')
# x_data is required by the BaseMetric interface but is not used by this metric
score = f1_metric.evaluate(x_data, y_true, y_pred)

Parameters

  • average (str): The averaging method. Options include 'micro', 'macro', 'weighted', 'samples', and None.

Comprehensive Example: Evaluating a Classification Model

In this example, we'll demonstrate how to use the Classification Metrics to evaluate the performance of a classification model.

from framework3.plugins.filters.classification.svm import ClassifierSVMPlugin
from framework3.plugins.metrics.classification import F1, Precission, Recall
from framework3.base.base_types import XYData
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load the Iris dataset
iris = load_iris()
X, y = iris.data, iris.target

# Split the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Create XYData objects
X_train_data = XYData(_hash='X_train', _path='/tmp', _value=X_train)
y_train_data = XYData(_hash='y_train', _path='/tmp', _value=y_train)
X_test_data = XYData(_hash='X_test', _path='/tmp', _value=X_test)
y_test_data = XYData(_hash='y_test', _path='/tmp', _value=y_test)

# Create and train the classifier
classifier = ClassifierSVMPlugin(kernel='rbf')
classifier.fit(X_train_data, y_train_data)

# Make predictions
predictions = classifier.predict(X_test_data)

# Initialize the metrics documented above
precision_metric = Precission(average='weighted')
recall_metric = Recall(average='weighted')
f1_metric = F1(average='weighted')

# Evaluate metrics; x_data is required by the interface but not used by these metrics
precision = precision_metric.evaluate(X_test_data, y_test_data, predictions)
recall = recall_metric.evaluate(X_test_data, y_test_data, predictions)
f1 = f1_metric.evaluate(X_test_data, y_test_data, predictions)

# Accuracy is computed with scikit-learn's accuracy_score here;
# see the Accuracy Score section above for the framework3 metric
accuracy = accuracy_score(y_test, predictions.value)

# Print results
print(f"Accuracy: {accuracy}")
print(f"Precision: {precision}")
print(f"Recall: {recall}")
print(f"F1 Score: {f1}")

This example demonstrates how to:

  1. Load and prepare the Iris dataset
  2. Create XYData objects for use with framework3
  3. Train an SVM classifier
  4. Make predictions on the test set
  5. Initialize and compute various classification metrics
  6. Print the evaluation results

Best Practices

  1. Multiple Metrics: Use multiple metrics to get a comprehensive view of your model's performance. Different metrics capture different aspects of classification performance.

  2. Class Imbalance: Be aware of class imbalance in your dataset. In such cases, accuracy alone might not be a good metric. Consider using precision, recall, and F1-score.

  3. Averaging Methods: When dealing with multi-class classification, pay attention to the averaging method used in metrics like precision, recall, and F1-score. 'Weighted' average is often a good choice for imbalanced datasets.

  4. Cross-Validation: Use cross-validation to get a more robust estimate of your model's performance, especially with smaller datasets.

  5. Confusion Matrix: Consider using a confusion matrix in addition to these metrics for a more detailed view of your model's performance across different classes (a minimal sketch follows this list).

  6. ROC AUC: For binary classification problems, consider using the ROC AUC score as an additional metric.

  7. Threshold Adjustment: Remember that metrics like precision and recall can be affected by adjusting the classification threshold. Consider exploring different thresholds if needed.

  8. Domain-Specific Metrics: Depending on your specific problem, you might need to implement custom metrics that are more relevant to your domain.
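
As a minimal sketch of the confusion-matrix suggestion in point 5, using scikit-learn directly (this page documents no confusion-matrix metric, so the call below is plain scikit-learn rather than a framework3 class):

from sklearn.metrics import confusion_matrix
import numpy as np

y_true = np.array([0, 1, 2, 0, 1, 2])
y_pred = np.array([0, 2, 1, 0, 0, 1])

# Rows are true classes, columns are predicted classes
print(confusion_matrix(y_true, y_pred))
# [[2 0 0]
#  [1 0 1]
#  [0 2 0]]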

Conclusion

The Classification Metrics module in framework3 provides essential tools for evaluating the performance of classification models. By using these metrics in combination with other framework3 components, you can gain valuable insights into your model's strengths and weaknesses. The example demonstrates how easy it is to compute and interpret these metrics within the framework3 ecosystem, enabling you to make informed decisions about your classification models.