sgnlp.models.ufd.utils.create_train_embeddings

create_train_embeddings(cfg: sgnlp.models.ufd.data_class.UFDArguments, tokenizer: sgnlp.models.ufd.tokenization.UFDTokenizer, model: sgnlp.models.ufd.modeling.UFDEmbeddingModel) → Dict

Helper function that generates the training dataset embeddings for supervised and unsupervised training.

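Parameters

cfg (UFDArguments) – UFD configuration arguments
tokenizer (UFDTokenizer) – UFD tokenizer instance
model (UFDEmbeddingModel) – UFD embedding model instance

Returns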

dictionary of dataset embeddings for the supervised and unsupervised datasets

Return type

Dict
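
Example

The general pattern described above (tokenize each dataset, embed it, and collect the results in a dictionary) can be sketched as follows. This is a minimal illustration only and is not the sgnlp implementation: `toy_tokenize`, `toy_embed`, `build_embeddings`, and the `"supervised"` / `"unsupervised"` keys are hypothetical stand-ins, and the real function takes `UFDArguments`, `UFDTokenizer`, and `UFDEmbeddingModel` instances whose returned dictionary layout may differ.

```python
from typing import Callable, Dict, List


def toy_tokenize(text: str) -> List[int]:
    """Hypothetical tokenizer: one id per whitespace token (its length)."""
    return [len(token) for token in text.split()]


def toy_embed(token_ids: List[int]) -> List[float]:
    """Hypothetical embedding model: [token count, mean token id]."""
    count = len(token_ids)
    mean_id = sum(token_ids) / count if count else 0.0
    return [float(count), mean_id]


def build_embeddings(
    raw: Dict[str, List[str]],
    tokenize: Callable[[str], List[int]],
    embed: Callable[[List[int]], List[float]],
) -> Dict[str, List[List[float]]]:
    """Embed every text in every dataset and key the results by dataset name."""
    return {
        name: [embed(tokenize(text)) for text in texts]
        for name, texts in raw.items()
    }


# Assumed dataset names, for illustration only.
raw_data = {
    "supervised": ["good movie", "bad plot twist"],
    "unsupervised": ["some unlabeled text"],
}
embeddings = build_embeddings(raw_data, toy_tokenize, toy_embed)
```

Here `embeddings` holds one list of vectors per dataset name, which mirrors the "dictionary of dataset embeddings" return value in spirit only.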