Back to Problems

Calculate Dice Score for Classification

Task: Compute the Dice Score

Your task is to implement a function dice_score(y_true, y_pred) that calculates the Dice Score, also known as the Sørensen–Dice coefficient or F1-score, for binary classification. The Dice Score is used to measure the similarity between two sets, particularly in tasks like image segmentation and binary classification.

Return the Dice Score rounded to 3 decimal places, and handle edge cases appropriately (e.g., when there are no true or predicted positives).

Example

Example:
y_true = np.array([1, 1, 0, 1, 0, 1])
y_pred = np.array([1, 1, 0, 0, 0, 1])
print(dice_score(y_true, y_pred))
Output: 0.857

Understanding Dice Score in Classification

The Dice Score, also known as the Sørensen–Dice coefficient or F1-score, is a statistical measure used to gauge the similarity of two samples. It's particularly popular in image segmentation tasks and binary classification problems.

Mathematical Definition

The Dice coefficient is defined as twice the intersection divided by the sum of the cardinalities of both sets:

\[ \text{Dice Score} = \frac{2|X \cap Y|}{|X| + |Y|} = \frac{2TP}{2TP + FP + FN} \]

In terms of binary classification:

  • TP (True Positives): Number of positions where both predicted and true labels are 1
  • FP (False Positives): Number of positions where prediction is 1 but true label is 0
  • FN (False Negatives): Number of positions where prediction is 0 but true label is 1

Relationship with F1-Score

The Dice coefficient is identical to the F1-score, which is the harmonic mean of precision and recall:

\[ \text{F1-score} = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}} = \text{Dice Score} \]

Key Properties

  • Range: The Dice score always falls between 0 and 1 (inclusive)
  • Perfect Score: A value of 1 indicates perfect overlap
  • No Overlap: A value of 0 indicates no overlap
  • Sensitivity: More sensitive to overlap than Jaccard Index
  • Symmetry: The score is symmetric, meaning DSC(A,B) = DSC(B,A)

Example

Consider two binary vectors:

  • True labels: [1, 1, 0, 1, 0, 1]
  • Predicted labels: [1, 1, 0, 0, 0, 1]

In this case:

  • True Positives (TP): 3
  • False Positives (FP): 0
  • False Negatives (FN): 1
  • Dice Score = (2 × 3) / (2 × 3 + 0 + 1) = 0.857

Advantages Over Jaccard Index

The Dice score has several advantages:

  • More sensitive to changes in overlap due to the doubled intersection term
  • Gives more weight to instances where labels agree
  • Often preferred in medical image segmentation due to its sensitivity to overlap
  • More intuitive interpretation as harmonic mean of precision and recall

Common Applications

The Dice score is commonly used in:

  • Medical image segmentation evaluation
  • Binary classification tasks
  • Object detection overlap assessment
  • Text similarity measurement
  • Semantic segmentation evaluation

When implementing the Dice score, it's important to handle edge cases properly, such as when both sets are empty (typically defined as 0.0 (according to sklearn)).

import numpy as np

def dice_score(y_true, y_pred):
    intersection = np.logical_and(y_true, y_pred).sum()
    true_sum = y_true.sum()
    pred_sum = y_pred.sum()

    # Handle edge cases
    if true_sum == 0 or pred_sum == 0:
        return 0.0

    dice = (2.0 * intersection) / (true_sum + pred_sum)
    return round(float(dice), 3)

There’s no video solution available yet 😔, but you can be the first to submit one at: GitHub link.

Your Solution

Output will be shown here.