Your task is to implement a function jaccard_index(y_true, y_pred) that calculates the Jaccard Index, a measure of similarity between two binary sets. The Jaccard Index is widely used in binary classification tasks to evaluate the overlap between predicted and true labels.
The function should handle the cases where the two arrays have no overlapping positives and where both arrays contain only zeros, returning 0.0 in each.
Example:
    y_true = np.array([1, 0, 1, 1, 0, 1])
    y_pred = np.array([1, 0, 1, 0, 0, 1])
    print(jaccard_index(y_true, y_pred))
Output: 0.75
The Jaccard Index, also known as the Jaccard Similarity Coefficient, is a statistic used to measure the similarity between sets. In the context of binary classification, it measures the overlap between predicted and actual positive labels.
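The Jaccard Index is defined as the size of the intersection divided by the size of the union of two sets:
$$J(A, B) = \frac{|A \cap B|}{|A \cup B|}$$
In the context of binary classification:
Intersection ($A \cap B$): the positions where both y_true and y_pred are 1 (true positives).
Union ($A \cup B$): the positions where at least one of y_true and y_pred is 1 (true positives, false positives, and false negatives).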
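As a rough sketch of the binary-classification view, the same ratio can be written in terms of confusion-matrix counts, J = TP / (TP + FP + FN). The helper name jaccard_from_counts is hypothetical, and the code assumes both inputs are 0/1 vectors of equal length.

```python
import numpy as np

def jaccard_from_counts(y_true, y_pred):
    """Hypothetical helper: Jaccard Index from TP, FP and FN counts."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))  # positive in both
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))  # predicted positive only
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))  # actual positive only
    denominator = tp + fp + fn  # size of the union of positives
    if denominator == 0:
        return 0.0  # assumption: follow this task's convention for empty sets
    return tp / denominator
```

True negatives never enter the calculation, which is why the index ignores positions where both vectors are 0.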
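Consider two binary vectors, for example the ones from the example above:
y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1]
In this case:
Intersection: positions 0, 2, and 5 are 1 in both vectors, so $|A \cap B| = 3$.
Union: positions 0, 2, 3, and 5 are 1 in at least one vector, so $|A \cup B| = 4$.
Jaccard Index: $3 / 4 = 0.75$.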
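To make the set view concrete, the sketch below treats each vector as the set of indices where it equals 1 and applies Python's built-in set operators. The vectors are the ones from the example at the top of this page; the helper name positive_indices is hypothetical.

```python
def positive_indices(labels):
    """Hypothetical helper: indices at which a binary vector equals 1."""
    return {i for i, value in enumerate(labels) if value == 1}

y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 1, 0, 0, 1]

a = positive_indices(y_true)   # {0, 2, 3, 5}
b = positive_indices(y_pred)   # {0, 2, 5}

intersection = a & b           # {0, 2, 5}
union = a | b                  # {0, 2, 3, 5}

# Guard against an empty union (both vectors all zeros).
jaccard = len(intersection) / len(union) if union else 0.0
print(jaccard)  # 0.75
```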
The Jaccard Index is particularly useful in:
When implementing the Jaccard Index, it's important to handle edge cases, such as when both sets are empty. The union is then zero and the ratio 0/0 is undefined; this task defines the index as 0 in that case, although some conventions use 1 instead.
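One possible way to handle these edge cases is to check the union before dividing, which avoids a division by zero altogether. This is a minimal sketch rather than the reference solution; the function name jaccard_with_guard and the empty_value parameter are assumptions introduced for illustration.

```python
import numpy as np

def jaccard_with_guard(y_true, y_pred, empty_value=0.0):
    """Hypothetical variant that checks for an empty union before dividing."""
    # Assumption: any nonzero entry counts as a positive label.
    true_pos = np.asarray(y_true).astype(bool)
    pred_pos = np.asarray(y_pred).astype(bool)
    if true_pos.shape != pred_pos.shape:
        raise ValueError("y_true and y_pred must have the same shape")
    union = np.count_nonzero(true_pos | pred_pos)
    if union == 0:
        # Neither vector has a positive, so the ratio 0/0 is undefined.
        return empty_value
    intersection = np.count_nonzero(true_pos & pred_pos)
    return intersection / union

print(jaccard_with_guard([0, 0, 0], [0, 0, 0]))  # 0.0 (both empty)
print(jaccard_with_guard([1, 1, 0], [0, 0, 1]))  # 0.0 (no overlap)
print(jaccard_with_guard([1, 0, 1, 1, 0, 1], [1, 0, 1, 0, 0, 1]))  # 0.75
```

Making the empty-set value a parameter keeps the convention explicit, so a caller who prefers 1 for two empty sets can pass empty_value=1.0.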
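import numpy as np

def jaccard_index(y_true, y_pred):
    # Count positions where both arrays are 1 (intersection)
    intersection = np.sum((y_true == 1) & (y_pred == 1))
    # Count positions where at least one array is 1 (union)
    union = np.sum((y_true == 1) | (y_pred == 1))
    # Both arrays contain only zeros: return 0.0 instead of dividing by zero
    if union == 0:
        return 0.0
    return round(float(intersection / union), 3)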