ELU (Exponential Linear Unit) mitigates the dying-ReLU problem by replacing the hard zero for negative inputs with a smooth negative region. For negative inputs, ELU outputs a negative value that saturates toward -alpha, which keeps mean activations closer to zero and preserves a nonzero gradient where ReLU would be flat.
SELU (Scaled ELU) is a self-normalizing variant: under suitable conditions (notably fully connected layers with LeCun-normal initialization), activations in deep networks converge toward zero mean and unit variance, removing the need for explicit normalization layers.
ELU:
f(x) = x if x > 0
f(x) = alpha * (exp(x) - 1) if x <= 0
Default: alpha = 1.0
SELU:
f(x) = scale * x if x > 0
f(x) = scale * alpha * (exp(x) - 1) if x <= 0
Fixed constants: scale = 1.0507009873554804, alpha = 1.6732632423543772
Signature: def elu_selu(x, mode='elu', alpha=1.0)
x: input array (any shape)
mode: 'elu' or 'selu'
alpha: saturation parameter for ELU (ignored for SELU, which uses its fixed alpha)
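
A minimal NumPy sketch of the requested signature, under the assumptions above (the constant names _SELU_SCALE and _SELU_ALPHA are my own, not part of the problem statement):

import numpy as np

# Canonical SELU constants from Klambauer et al. (2017).
_SELU_SCALE = 1.0507009873554804
_SELU_ALPHA = 1.6732632423543772

def elu_selu(x, mode='elu', alpha=1.0):
    """Apply ELU or SELU elementwise to an array of any shape."""
    x = np.asarray(x, dtype=float)
    if mode == 'elu':
        scale = 1.0
    elif mode == 'selu':
        # SELU ignores the caller-supplied alpha and uses its fixed constants.
        scale, alpha = _SELU_SCALE, _SELU_ALPHA
    else:
        raise ValueError(f"mode must be 'elu' or 'selu', got {mode!r}")
    # expm1(x) = exp(x) - 1, computed accurately for small |x|; clamping at 0
    # avoids overflow for large positive x, whose negative branch is unused anyway.
    negative = scale * alpha * np.expm1(np.minimum(x, 0.0))
    return np.where(x > 0, scale * x, negative)

Using np.where keeps the function vectorized over any input shape; the two branches mirror the piecewise definitions above.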
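
As a quick sanity check of the self-normalizing claim above: feeding standard-normal samples through SELU should leave the mean near 0 and the standard deviation near 1 (this assumes the elu_selu sketch above and NumPy):

rng = np.random.default_rng(0)
z = rng.standard_normal(1_000_000)   # unit-Gaussian pre-activations
out = elu_selu(z, mode='selu')
print(out.mean(), out.std())         # both land close to 0 and 1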