Example:
print(leaky_relu(0))             # Output: 0.0
print(leaky_relu(1))             # Output: 1
print(leaky_relu(-1))            # Output: -0.01
print(leaky_relu(-2, alpha=0.1)) # Output: -0.2
The Leaky ReLU (Leaky Rectified Linear Unit) activation function is a variant of the ReLU function used in neural networks. It addresses the "dying ReLU" problem by allowing a small, non-zero gradient when the input is negative. This small slope for negative inputs keeps gradients flowing and prevents neurons from becoming permanently inactive during training.
The Leaky ReLU function is mathematically defined as:
\[ f(z) = \begin{cases} z & \text{if } z > 0 \\ \alpha z & \text{if } z \leq 0 \end{cases} \]

where \(z\) is the input to the function and \(\alpha\) is a small positive constant, typically \(\alpha = 0.01\).
In this definition, the function returns \(z\) for positive inputs and \(\alpha z\) otherwise, so a small, non-zero gradient still flows when the input is negative.
This function is particularly useful in deep learning models: it mitigates the "dead neuron" issue of standard ReLU by ensuring that a gradient can still propagate even when the input is negative, which improves learning dynamics in the network.
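Concretely, the gradient that backpropagation uses follows directly from the piecewise definition above (the derivative is not defined exactly at \(z = 0\); implementations pick one of the two values by convention):

\[ f'(z) = \begin{cases} 1 & \text{if } z > 0 \\ \alpha & \text{if } z < 0 \end{cases} \]

Because \(\alpha > 0\), the gradient never vanishes entirely for negative inputs, which is what keeps the unit from getting stuck.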
def leaky_relu(z: float, alpha: float = 0.01) -> float:
    """Return z if z is positive, otherwise alpha * z (the Leaky ReLU)."""
    return z if z > 0 else alpha * z
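For array inputs, the same rule can be applied element-wise. Below is a minimal vectorized sketch using NumPy, where np.where selects between the two branches; the name leaky_relu_vec is illustrative and not part of the original solution.

import numpy as np

def leaky_relu_vec(z, alpha: float = 0.01) -> np.ndarray:
    """Element-wise Leaky ReLU: keeps positive entries, scales the rest by alpha."""
    z = np.asarray(z, dtype=float)
    return np.where(z > 0, z, alpha * z)

print(leaky_relu_vec([1.0, 0.0, -1.0, -2.0], alpha=0.1))
# Prints an array with the values [1.0, 0.0, -0.1, -0.2]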