ContextGate
Layer that computes the context gate values, which determine how much of the context is included when decoding the output token.
forward
Conv1dDecoder output. Shape of (batch size, sequence length, hidden dim).
Summarised representation of encoder states (i.e., attention-transformed encoder outputs). Shape of (batch size, sequence length, hidden dim).
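As a rough illustration of what a context gate computes, here is a minimal framework-free sketch. The gate formula used (a sigmoid over a linear combination of the decoder output and the context) is an assumption, since this reference does not state the exact formulation. The names `context_gate`, `apply_gate`, `w_dec`, `w_ctx`, and `bias` are hypothetical and are not part of the documented API.

```python
import math


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def context_gate(decoder_out, context, w_dec, w_ctx, bias):
    """Hypothetical gate: sigmoid(w_dec @ d + w_ctx @ c + bias) per position.

    decoder_out, context: nested lists shaped (batch, seq_len, hidden).
    w_dec, w_ctx: (hidden, hidden) weight matrices; bias: (hidden,).
    Returns gate values in (0, 1) with the same shape as the inputs.
    """
    gates = []
    for dec_seq, ctx_seq in zip(decoder_out, context):
        gate_seq = []
        for d, c in zip(dec_seq, ctx_seq):
            gate_seq.append([
                sigmoid(
                    sum(w * x for w, x in zip(w_dec[i], d))
                    + sum(w * x for w, x in zip(w_ctx[i], c))
                    + bias[i]
                )
                for i in range(len(bias))
            ])
        gates.append(gate_seq)
    return gates


def apply_gate(gates, context):
    # Scale each context value by its gate (assumed usage of the gate).
    return [
        [[g * c for g, c in zip(g_vec, c_vec)]
         for g_vec, c_vec in zip(g_seq, c_seq)]
        for g_seq, c_seq in zip(gates, context)
    ]


# Toy inputs: batch size 1, sequence length 3, hidden dim 2.
decoder_out = [[[1.0, 2.0], [0.5, -1.0], [0.0, 0.0]]]
context = [[[0.2, 0.4], [1.0, 1.0], [-0.5, 0.3]]]
zero_w = [[0.0, 0.0], [0.0, 0.0]]

# With all-zero weights and bias, every gate value is sigmoid(0) = 0.5.
gates = context_gate(decoder_out, context, zero_w, zero_w, [0.0, 0.0])
gated_context = apply_gate(gates, context)
```

In this sketch a gate value near 1 passes the context through almost unchanged, while a value near 0 suppresses it. A real layer would learn `w_dec`, `w_ctx`, and `bias` during training and would normally be written with a tensor library.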