ctc_loss

brainpy.losses.ctc_loss(logits, logit_paddings, labels, label_paddings, blank_id=0, log_epsilon=-100000.0)

Computes CTC loss.

See docstring for ctc_loss_with_forward_probs for details.

Parameters:
  • logits (ArrayType) – (B, T, K)-array containing logits of each class, where B denotes the batch size, T denotes the max time frames in logits, and K denotes the number of classes including a class for blanks.

  • logit_paddings (ArrayType) – (B, T)-array. Padding indicators for logits. Each element must be either 1.0 or 0.0, and logit_paddings[b, t] == 1.0 denotes that logits[b, t, :] are padded values.

  • labels (ArrayType) – (B, N)-array containing reference integer labels, where N denotes the maximum length of the label sequences.

  • label_paddings (ArrayType) – (B, N)-array. Padding indicators for labels. Each element must be either 1.0 or 0.0, and label_paddings[b, n] == 1.0 denotes that labels[b, n] is a padded label. In the current implementation, labels must be right-padded, i.e. each row label_paddings[b, :] must be a run of zeros followed by a run of ones (a sketch for building such masks follows this parameter list).

  • blank_id (int) – ID of the blank token. logits[b, :, blank_id] are used as probabilities of blank symbols.

  • log_epsilon (float) – Numerically stable approximation of log(+0), i.e. a large negative value used in place of negative infinity.
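
The right-padded indicator arrays described above can be derived from per-sequence lengths. The helper below is only an illustrative sketch (paddings_from_lengths is not part of brainpy); it assumes JAX-style arrays, which brainpy builds on.

    import jax.numpy as jnp

    def paddings_from_lengths(lengths, max_len):
        # 1.0 marks padded (invalid) positions, 0.0 marks valid ones,
        # giving each row a run of zeros followed by a run of ones.
        positions = jnp.arange(max_len)                                # (max_len,)
        return (positions[None, :] >= lengths[:, None]).astype(jnp.float32)

    # Two label sequences of lengths 3 and 4, padded to N = 4:
    label_paddings = paddings_from_lengths(jnp.array([3, 4]), 4)
    # -> [[0., 0., 0., 1.],
    #     [0., 0., 0., 0.]]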

Return type:

ArrayType

Returns:

(B,)-array containing loss values for each sequence in the batch.
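
A minimal end-to-end call, following the shapes documented above. The concrete values (B = 2, T = 10, N = 4, K = 6) and the zero-initialized logits are illustrative assumptions; only the argument names, shapes, and the (B,) return shape come from this page.

    import jax.numpy as jnp
    import brainpy as bp

    B, T, N, K = 2, 10, 4, 6                   # batch, time frames, label length, classes (incl. blank)

    logits = jnp.zeros((B, T, K))              # (B, T, K) per-frame class logits; class 0 is the blank
    logit_paddings = jnp.array([[0.] * 8 + [1.] * 2,   # first sequence has 8 valid frames
                                [0.] * 10])            # second sequence uses all 10 frames

    labels = jnp.array([[1, 2, 3, 0],          # (B, N) reference labels (non-blank ids);
                        [4, 5, 1, 2]])         # the trailing 0 in row 0 sits in a padded slot
    label_paddings = jnp.array([[0., 0., 0., 1.],
                                [0., 0., 0., 0.]])

    per_sequence_loss = bp.losses.ctc_loss(logits, logit_paddings, labels,
                                           label_paddings, blank_id=0)
    print(per_sequence_loss.shape)             # (B,): one loss value per sequence in the batch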