RecconSpanExtractionTokenizer
class RecconSpanExtractionTokenizer(vocab_file: str, do_lower_case: bool = False, **kwargs)
Constructs a Reccon Span Extraction tokenizer, derived from the Bert tokenizer.
Parameters:
- vocab_file (str) – Path to the vocabulary file.
- do_lower_case (bool, defaults to False) – Whether or not to lowercase the input when tokenizing.
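To illustrate why do_lower_case matters, the following is a minimal pure-Python sketch of greedy longest-match WordPiece splitting. It is a simplified stand-in and not the library's implementation: the function name toy_wordpiece and the six-entry vocabulary are hypothetical, chosen only to show how casing changes vocabulary lookups.

```python
def toy_wordpiece(text, vocab, do_lower_case=False, unk="[UNK]"):
    """Greedy longest-match-first WordPiece over whitespace-split words.

    Hypothetical helper for illustration; real tokenizers also handle
    punctuation, accents, and special tokens.
    """
    if do_lower_case:
        text = text.lower()
    tokens = []
    for word in text.split():
        start, pieces = 0, []
        while start < len(word):
            end = len(word)
            piece = None
            # Try the longest remaining substring first, then shrink it.
            while end > start:
                candidate = word[start:end]
                if start > 0:
                    candidate = "##" + candidate  # continuation piece
                if candidate in vocab:
                    piece = candidate
                    break
                end -= 1
            if piece is None:
                # No piece matched: the whole word becomes unknown.
                pieces = [unk]
                break
            pieces.append(piece)
            start = end
        tokens.extend(pieces)
    return tokens


# Toy vocabulary (an assumption for this sketch, not a real vocab file).
vocab = {"chinese", "new", "year", "tai", "##wan", "Taiwan"}

cased = toy_wordpiece("Chinese New Year Taiwan", vocab, do_lower_case=False)
uncased = toy_wordpiece("Chinese New Year Taiwan", vocab, do_lower_case=True)
print(cased)    # ['[UNK]', '[UNK]', '[UNK]', 'Taiwan']
print(uncased)  # ['chinese', 'new', 'year', 'tai', '##wan']
```

The takeaway is that do_lower_case should match the casing of the vocabulary file: with a cased vocabulary, lowercasing the input can split or lose tokens that would otherwise match whole, and with an uncased vocabulary, leaving it off can turn capitalized words into unknown tokens.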
Example:
    from sg_nlp import RecconSpanExtractionTokenizer

    tokenizer = RecconSpanExtractionTokenizer.from_pretrained("mrm8488/spanbert-finetuned-squadv2")
    text = "Our company's wei-ya is tomorrow night ! It's your first Chinese New Year in Taiwan--you must be excited !"
    inputs = tokenizer(text, return_tensors="pt")