## Implementation of Log Softmax Function


## Understanding Log Softmax Function

In machine learning and statistics, the softmax function is a generalization of the logistic function that converts a vector of scores into probabilities. The log-softmax function is the logarithm of the softmax function; it is often preferred for numerical stability when the scores are large.

Given a 1D NumPy array of scores, implement a Python function to compute the log-softmax of the array.

Example:

```python
A = np.array([1, 2, 3])
print(log_softmax(A))
```

Output:

```python
array([-2.4076, -1.4076, -0.4076])
```

The log softmax function is a numerically stable way of calculating the logarithm of the softmax function. The softmax function converts a vector of arbitrary values (logits) into a vector of probabilities, where each value lies between 0 and 1, and the values sum to 1. The softmax function is given by:

\[ \text{softmax}(x_i) = \frac{e^{x_i}}{\sum_{j=1}^n e^{x_j}} \]

However, directly applying the logarithm to the softmax function can lead to numerical instability, especially when dealing with large numbers. To prevent this, we use the log-softmax function, which incorporates a shift by subtracting the maximum value from the input vector:
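As a point of comparison, here is a minimal sketch of the softmax formula above, written as a direct transcription (no stability shift yet):

```python
import numpy as np

def softmax(x: np.ndarray) -> np.ndarray:
    # Direct transcription of the formula: exponentiate, then normalize
    e = np.exp(x)
    return e / np.sum(e)

probs = softmax(np.array([1.0, 2.0, 3.0]))
print(probs)          # each entry lies between 0 and 1
print(np.sum(probs))  # the entries sum to 1
```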

\[ \text{log softmax}(x_i) = x_i - \max(x) - \log\left(\sum_{j=1}^n e^{x_j - \max(x)}\right) \]

This formulation helps to avoid overflow issues that can occur when exponentiating large numbers. The log-softmax function is particularly useful in machine learning for calculating probabilities in a stable manner, especially when used with cross-entropy loss functions.
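The overflow problem is easy to demonstrate. In the sketch below (illustrative values chosen for the demo), the naive `log(softmax(x))` produces `nan` because `exp(1000)` overflows to infinity, while the max-shifted formula above stays finite:

```python
import numpy as np

x = np.array([1000.0, 1001.0, 1002.0])

# Naive approach: exp(1000) overflows to inf, so inf/inf -> nan
with np.errstate(over="ignore", invalid="ignore"):
    naive = np.log(np.exp(x) / np.sum(np.exp(x)))

# Shifted approach: subtract max(x) before exponentiating
shifted = x - np.max(x)
stable = shifted - np.log(np.sum(np.exp(shifted)))

print(naive)   # [nan nan nan]
print(stable)  # finite values: roughly [-2.4076 -1.4076 -0.4076]
```

Note that shifting by `max(x)` leaves the result unchanged mathematically (the shift cancels between numerator and denominator of the softmax) while keeping every exponent at most 0, so `exp` never overflows.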

```python
import numpy as np

def log_softmax(scores: list) -> np.ndarray:
    # Convert to a float array so the arithmetic also works on plain lists
    scores = np.asarray(scores, dtype=float)
    # Subtract the maximum value for numerical stability
    scores = scores - np.max(scores)
    return scores - np.log(np.sum(np.exp(scores)))
```
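A quick check of the solution against the example above (the function is redefined here so the snippet is self-contained):

```python
import numpy as np

def log_softmax(scores: list) -> np.ndarray:
    # Convert to a float array so the arithmetic also works on plain lists
    scores = np.asarray(scores, dtype=float)
    # Subtract the maximum value for numerical stability
    scores = scores - np.max(scores)
    return scores - np.log(np.sum(np.exp(scores)))

out = log_softmax([1, 2, 3])
print(np.round(out, 4))  # [-2.4076 -1.4076 -0.4076]

# Exponentiating recovers the softmax probabilities, which sum to 1
print(round(float(np.sum(np.exp(out))), 6))  # 1.0
```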

