RecconSpanExtractionModel

class RecconSpanExtractionModel(config)[source]

The Reccon Span Extraction Model with a span classification head on top for extractive question-answering tasks like SQuAD.
This model is also a PyTorch torch.nn.Module subclass. Use it as a regular PyTorch Module and refer to the PyTorch documentation for all matters related to general usage and behavior.
- Parameters
config (RecconSpanExtractionConfig) – Model configuration class with all the parameters of the model. Initializing with a config file does not load the weights associated with the model, only the configuration. Use the from_pretrained method to load the model weights.
Example:
```python
from sgnlp.models.span_extraction import (
    RecconSpanExtractionConfig,
    RecconSpanExtractionTokenizer,
    RecconSpanExtractionModel,
    utils,
)

# 1. load from default
config = RecconSpanExtractionConfig()
model = RecconSpanExtractionModel(config)

# 2. load from pretrained
config = RecconSpanExtractionConfig.from_pretrained("https://storage.googleapis.com/sgnlp/models/reccon_span_extraction/config.json")
model = RecconSpanExtractionModel.from_pretrained(
    "https://storage.googleapis.com/sgnlp/models/reccon_span_extraction/pytorch_model.bin",
    config=config,
)

# Using model
tokenizer = RecconSpanExtractionTokenizer.from_pretrained("mrm8488/spanbert-finetuned-squadv2")
text = {
    'context': "Our company's wei-ya is tomorrow night ! It's your first Chinese New Year in Taiwan--you must be excited !",
    'qas': [{
        'id': 'dailydialog_tr_1097_utt_1_true_cause_utt_1_span_0',
        'is_impossible': False,
        'question': "The target utterance is Our company's wei-ya is tomorrow night ! It's your first Chinese New Year in Taiwan--you must be excited ! The evidence utterance is Our company's wei-ya is tomorrow night ! It's your first Chinese New Year in Taiwan--you must be excited ! What is the causal span from context that is relevant to the target utterance's emotion happiness ?",
        'answers': [{'text': "Our company's wei-ya is tomorrow night ! It's your first Chinese New Year in Taiwan", 'answer_start': 0}],
    }],
}
dataset, _, _ = utils.load_examples(text, tokenizer)
inputs = {"input_ids": dataset[0], "attention_mask": dataset[1], "token_type_ids": dataset[2]}
outputs = model(**inputs)
```
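The example above stops at the raw forward pass. As a follow-up, here is a minimal decoding sketch (an assumption for illustration, not part of the documented API: it presumes the outputs behave like a standard QuestionAnsweringModelOutput and that dataset[0] carries a batch dimension):

```python
import torch

# Hedged decoding sketch: pick the highest-scoring start/end positions
# and decode the token ids between them back into text.
with torch.no_grad():
    outputs = model(**inputs)

start_idx = outputs.start_logits[0].argmax().item()
end_idx = outputs.end_logits[0].argmax().item()
answer = tokenizer.decode(inputs["input_ids"][0, start_idx : end_idx + 1])
print(answer)
```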
forward(**kwargs)[source]

The BertForQuestionAnswering forward method overrides the __call__ special method.
Tip: Although the recipe for the forward pass needs to be defined within this function, one should call the Module instance afterwards instead of this, since the former takes care of running the pre- and post-processing steps while the latter silently ignores them.
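Concretely, a minimal illustration of the tip above (not taken from the original docs):

```python
# Preferred: calling the module instance runs registered pre/post-processing hooks.
outputs = model(**inputs)

# Works, but bypasses those hooks and is discouraged.
outputs = model.forward(**inputs)
```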
- Parameters
input_ids (torch.LongTensor of shape (batch_size, sequence_length)) –
Indices of input sequence tokens in the vocabulary.
Indices can be obtained using BertTokenizer. See PreTrainedTokenizer.encode and PreTrainedTokenizer.__call__ for details; a hedged tokenizer sketch follows this parameter list.
attention_mask (torch.FloatTensor of shape (batch_size, sequence_length), optional) –
Mask to avoid performing attention on padding token indices. Mask values selected in [0, 1]:
1 for tokens that are not masked,
0 for tokens that are masked.
token_type_ids (torch.LongTensor of shape (batch_size, sequence_length), optional) –
Segment token indices to indicate first and second portions of the inputs. Indices are selected in [0, 1]:
0 corresponds to a sentence A token,
1 corresponds to a sentence B token.
position_ids (torch.LongTensor of shape (batch_size, sequence_length), optional) –
Indices of positions of each input sequence token in the position embeddings. Selected in the range [0, config.max_position_embeddings - 1].
head_mask (torch.FloatTensor of shape (num_heads,) or (num_layers, num_heads), optional) –
Mask to nullify selected heads of the self-attention modules. Mask values selected in [0, 1]:
1 indicates the head is not masked,
0 indicates the head is masked.
inputs_embeds (torch.FloatTensor of shape (batch_size, sequence_length, hidden_size), optional) – Optionally, instead of passing input_ids, you can choose to directly pass an embedded representation. This is useful if you want more control over how to convert input_ids indices into associated vectors than the model’s internal embedding lookup matrix provides.
output_attentions (bool, optional) – Whether or not to return the attentions tensors of all attention layers. See attentions under returned tensors for more detail.
output_hidden_states (bool, optional) – Whether or not to return the hidden states of all layers. See hidden_states under returned tensors for more detail.
return_dict (bool, optional) – Whether or not to return a ModelOutput instead of a plain tuple.
start_positions (torch.LongTensor of shape (batch_size,), optional) – Labels for the position (index) of the start of the labelled span, used for computing the token classification loss. Positions are clamped to the length of the sequence (sequence_length); positions outside of the sequence are not taken into account for computing the loss.
end_positions (torch.LongTensor of shape (batch_size,), optional) – Labels for the position (index) of the end of the labelled span, used for computing the token classification loss. Positions are clamped to the length of the sequence (sequence_length); positions outside of the sequence are not taken into account for computing the loss.
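To make the input tensors above concrete, here is a hedged sketch, assuming RecconSpanExtractionTokenizer behaves like a standard Hugging Face BERT tokenizer (a single __call__ on a question/context pair produces all three tensors; the question and context strings are hypothetical):

```python
from sgnlp.models.span_extraction import RecconSpanExtractionTokenizer

# Assumption: the tokenizer inherits the standard Hugging Face __call__ API.
tokenizer = RecconSpanExtractionTokenizer.from_pretrained("mrm8488/spanbert-finetuned-squadv2")
encoded = tokenizer(
    "What is the causal span ?",                 # question -> token_type_id 0
    "Our company's wei-ya is tomorrow night !",  # context  -> token_type_id 1
    return_tensors="pt",
)
print(encoded["input_ids"].shape)   # torch.Size([1, sequence_length])
print(encoded["attention_mask"])    # 1 = real token, 0 = padding
print(encoded["token_type_ids"])    # 0 = sentence A, 1 = sentence B
```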
- Returns
A transformers.modeling_outputs.QuestionAnsweringModelOutput or a tuple of torch.FloatTensor (if return_dict=False is passed or when config.return_dict=False) comprising various elements depending on the configuration (BertConfig) and inputs.
loss (torch.FloatTensor of shape (1,), optional, returned when start_positions and end_positions are provided) – Total span extraction loss is the sum of a Cross-Entropy for the start and end positions.
start_logits (torch.FloatTensor of shape (batch_size, sequence_length)) – Span-start scores (before SoftMax).
end_logits (torch.FloatTensor of shape (batch_size, sequence_length)) – Span-end scores (before SoftMax).
hidden_states (tuple(torch.FloatTensor), optional, returned when output_hidden_states=True is passed or when config.output_hidden_states=True) – Tuple of torch.FloatTensor (one for the output of the embeddings, if the model has an embedding layer, + one for the output of each layer) of shape (batch_size, sequence_length, hidden_size).
Hidden-states of the model at the output of each layer plus the optional initial embedding outputs.
attentions (tuple(torch.FloatTensor), optional, returned when output_attentions=True is passed or when config.output_attentions=True) – Tuple of torch.FloatTensor (one for each layer) of shape (batch_size, num_heads, sequence_length, sequence_length).
Attentions weights after the attention softmax, used to compute the weighted average in the self-attention heads.
- Return type
transformers.modeling_outputs.QuestionAnsweringModelOutput or tuple(torch.FloatTensor)
Example:
```python
>>> from transformers import BertTokenizer, BertForQuestionAnswering
>>> import torch

>>> tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
>>> model = BertForQuestionAnswering.from_pretrained("bert-base-uncased")

>>> question, text = "Who was Jim Henson?", "Jim Henson was a nice puppet"

>>> inputs = tokenizer(question, text, return_tensors="pt")
>>> with torch.no_grad():
...     outputs = model(**inputs)

>>> answer_start_index = outputs.start_logits.argmax()
>>> answer_end_index = outputs.end_logits.argmax()

>>> predict_answer_tokens = inputs.input_ids[0, answer_start_index : answer_end_index + 1]
>>> tokenizer.decode(predict_answer_tokens)
```

```python
>>> # target is "nice puppet"
>>> target_start_index, target_end_index = torch.tensor([14]), torch.tensor([15])

>>> outputs = model(**inputs, start_positions=target_start_index, end_positions=target_end_index)
>>> loss = outputs.loss
>>> round(loss.item(), 2)
```
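Because the forward pass returns the summed start/end cross-entropy as outputs.loss, a training step follows the usual PyTorch pattern. A hedged sketch (AdamW and the learning rate are illustrative assumptions, not defaults of this API):

```python
import torch

# Illustrative single training step; the optimizer choice and lr are assumptions.
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

model.train()
outputs = model(**inputs, start_positions=target_start_index, end_positions=target_end_index)
outputs.loss.backward()  # total loss = CE(start_logits) + CE(end_logits)
optimizer.step()
optimizer.zero_grad()
```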