Transformer

class Transformer(config, n_layers, d_model, n_heads)
forward(query, key, val, key_structure=None, val_structure=None, attention_mask=None)

Takes an input sequence, passed as query, key, and val, and applies multi-head attention (MHA) to it.
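To make the MHA step concrete, here is a minimal NumPy sketch of multi-head scaled dot-product attention. It is an illustration under stated assumptions, not this class's implementation: the function name `multi_head_attention` is hypothetical, learned projection weights are omitted, inputs are assumed to be unbatched arrays of shape (seq_len, d_model), and `attention_mask` is assumed to be a boolean array where True means "may attend". The `key_structure` and `val_structure` arguments are left out because their semantics are not described here.

```python
import numpy as np


def multi_head_attention(query, key, val, n_heads, attention_mask=None):
    """Scaled dot-product attention split across n_heads (no learned projections).

    query: (seq_q, d_model); key, val: (seq_k, d_model).
    attention_mask: optional boolean (seq_q, seq_k) array, True = may attend
    (an assumed convention, not taken from the documented class).
    """
    seq_q, d_model = query.shape
    seq_k = key.shape[0]
    if d_model % n_heads != 0:
        raise ValueError("d_model must be divisible by n_heads")
    d_head = d_model // n_heads

    # Split the feature dimension into heads: (n_heads, seq, d_head).
    q = query.reshape(seq_q, n_heads, d_head).transpose(1, 0, 2)
    k = key.reshape(seq_k, n_heads, d_head).transpose(1, 0, 2)
    v = val.reshape(seq_k, n_heads, d_head).transpose(1, 0, 2)

    # Scaled dot-product scores per head: (n_heads, seq_q, seq_k).
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d_head)
    if attention_mask is not None:
        # Masked-out positions get a large negative score, so their weight is ~0.
        scores = np.where(attention_mask[None, :, :], scores, -1e9)

    # Numerically stable softmax over the key axis.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)

    # Weighted sum of values, then merge heads back to (seq_q, d_model).
    out = weights @ v
    return out.transpose(1, 0, 2).reshape(seq_q, d_model)


rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))                      # 5 tokens, d_model = 8
causal = np.tril(np.ones((5, 5), dtype=bool))    # token i attends to tokens <= i
y = multi_head_attention(x, x, x, n_heads=2, attention_mask=causal)
print(y.shape)
```

Passing the same array as query, key, and val gives self-attention, which matches the "takes in a sequence" description above. With the causal mask, the first token can only attend to itself, so its output row equals its input row.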