ctc_loss_with_forward_probs#
- class brainpy.losses.ctc_loss_with_forward_probs(logits, logit_paddings, labels, label_paddings, blank_id=0, log_epsilon=-100000.0)[source]#
Computes CTC loss and CTC forward-probabilities.
The CTC loss is a loss function based on log-likelihoods of the model that introduces a special blank symbol \(\phi\) to represent variable-length output sequences.
Forward probabilities returned by this function, as auxiliary results, are grouped into two part: blank alpha-probability and non-blank alpha probability. Those are defined as follows:
\[\begin{split}\alpha_{\mathrm{BLANK}}(t, n) = \sum_{\pi_{1:t-1}} p(\pi_t = \phi | \pi_{1:t-1}, y_{1:n-1}, \cdots), \\ \alpha_{\mathrm{LABEL}}(t, n) = \sum_{\pi_{1:t-1}} p(\pi_t = y_n | \pi_{1:t-1}, y_{1:n-1}, \cdots).\end{split}\]Here, \(\pi\) denotes the alignment sequence in the reference [Graves et al, 2006] that is blank-inserted representations of
labels. The return values are the logarithms of the above probabilities.- Parameters:
logits (
TypeVar(ArrayType,Array,Variable,TrainVar,Array,ndarray)) – (B, T, K)-array containing logits of each class where B denotes the batch size, T denotes the max time frames inlogits, and K denotes the number of classes including a class for blanks.logit_paddings (
TypeVar(ArrayType,Array,Variable,TrainVar,Array,ndarray)) – (B, T)-array. Padding indicators forlogits. Each element must be either 1.0 or 0.0, andlogitpaddings[b, t] == 1.0denotes thatlogits[b, t, :]are padded values.labels (
TypeVar(ArrayType,Array,Variable,TrainVar,Array,ndarray)) – (B, N)-array containing reference integer labels where N denotes the max time frames in the label sequence.label_paddings (
TypeVar(ArrayType,Array,Variable,TrainVar,Array,ndarray)) – (B, N)-array. Padding indicators forlabels. Each element must be either 1.0 or 0.0, andlabelpaddings[b, n] == 1.0denotes thatlabels[b, n]is a padded label. In the current implementation,labelsmust be right-padded, i.e. each rowlabelpaddings[b, :]must be repetition of zeroes, followed by repetition of ones.blank_id (
int) – Id for blank token.logits[b, :, blank_id]are used as probabilities of blank symbols.log_epsilon (
float) – Numerically-stable approximation of log(+0).
- Return type:
Tuple[TypeVar(ArrayType,Array,Variable,TrainVar,Array,ndarray),TypeVar(ArrayType,Array,Variable,TrainVar,Array,ndarray),TypeVar(ArrayType,Array,Variable,TrainVar,Array,ndarray)]- Returns:
A tuple ``(loss_value,logalpha_blank,logalpha_nonblank)`. Here,`loss_valueis a (B,)-array containing the loss values for each sequencein the batch,logalpha_blankandlogalpha_nonblankare(T,B,N+1)-arrays where the (t,b,n)-th element denoteslog alpha_B(t,n)andlog alpha_L(t,n),respectively,for ``b`-th`sequence in the batch.
References
[Graves et al, 2006](https://dl.acm.org/doi/abs/10.1145/1143844.1143891)