SenticGCNBertPreprocessor
class SenticGCNBertPreprocessor(tokenizer: Union[str, transformers.tokenization_utils.PreTrainedTokenizer] = 'bert-base-uncased', embedding_model: Union[str, transformers.modeling_utils.PreTrainedModel] = 'bert-base-uncased', config_filename: str = 'config.json', model_filename: str = 'pytorch_model.bin', spacy_pipeline: str = 'en_core_web_sm', senticnet: Union[str, Dict[str, float]] = 'https://storage.googleapis.com/sgnlp/models/sentic_gcn/senticnet.pickle', max_len: int = 85, device: str = 'cpu')
Class for preprocessing sentence(s) and their aspect(s) into a batch of tensors for the SenticGCNBertModel to predict on.
__call__(data_batch: List[Dict[str, Union[str, List[str]]]]) → Tuple[List[sgnlp.models.sentic_gcn.preprocess.SenticGCNBertData], List[torch.Tensor]]
Generates a list of input tensors from a list of sentences and their accompanying lists of aspects.
- Parameters
data_batch (List[Dict[str, Union[str, List[str]]]]) – list of dictionaries with two keys, ‘sentence’ and ‘aspect’. Each ‘sentence’ value is a string and each ‘aspect’ value is a list of the aspects accompanying that sentence.
- Returns
A tuple of the processed data items and a list of ordered tensors for ‘text_indices’, ‘aspect_indices’, ‘left_indices’, ‘text_embeddings’ and ‘sdat_graph’.
- Return type
Tuple[List[SenticGCNBertData], List[torch.Tensor]]
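The input and output shapes described for __call__ can be illustrated with a small sketch. It does not import sgnlp; it only builds a data_batch in the documented format and labels a returned tensor list by position. The helpers check_data_batch and name_tensors, the constant TENSOR_NAMES and the example sentences are hypothetical names introduced here for illustration, and the tensor order is assumed to match the order given in the Returns description.

```python
from typing import Dict, List, Sequence, Union

DataBatch = List[Dict[str, Union[str, List[str]]]]

# Assumed order of the returned tensors, taken from the Returns description.
TENSOR_NAMES = (
    "text_indices",
    "aspect_indices",
    "left_indices",
    "text_embeddings",
    "sdat_graph",
)


def check_data_batch(data_batch: DataBatch) -> None:
    """Hypothetical helper: raise ValueError if an entry does not match the documented format."""
    for i, entry in enumerate(data_batch):
        if set(entry) != {"sentence", "aspect"}:
            raise ValueError(f"entry {i}: expected exactly the keys 'sentence' and 'aspect'")
        if not isinstance(entry["sentence"], str):
            raise ValueError(f"entry {i}: 'sentence' must be a string")
        aspects = entry["aspect"]
        if not isinstance(aspects, list) or not all(isinstance(a, str) for a in aspects):
            raise ValueError(f"entry {i}: 'aspect' must be a list of strings")


def name_tensors(tensors: Sequence[object]) -> Dict[str, object]:
    """Hypothetical helper: label the ordered tensor list by position."""
    if len(tensors) != len(TENSOR_NAMES):
        raise ValueError(f"expected {len(TENSOR_NAMES)} tensors, got {len(tensors)}")
    return dict(zip(TENSOR_NAMES, tensors))


# One dictionary per sentence; each sentence may carry several aspects.
data_batch: DataBatch = [
    {
        "sentence": "The battery life is great but the screen is dim.",
        "aspect": ["battery life", "screen"],
    },
    {"sentence": "Service was slow.", "aspect": ["Service"]},
]
check_data_batch(data_batch)

# With sgnlp installed, the call itself would look roughly like this (not run here):
#   preprocessor = SenticGCNBertPreprocessor(device="cpu")
#   processed_data, tensors = preprocessor(data_batch)
#   named = name_tensors(tensors)
```

Validating the batch before the call keeps format mistakes, such as passing a single string for ‘aspect’, from surfacing later inside tokenization. Instantiating the preprocessor with its defaults is expected to fetch the pretrained resources named in the class signature.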