# Vocab
To create the vocab.txt file, run make_new_vocab.py.
# Prep dataset
prep_dataset_training: formats and splits the dataset so it can be used for training. Before running it, set which dataset version to build.
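
As a rough illustration of the "format and split" step, here is a minimal sketch. It is not the code in prep_dataset_training: the function name `split_dataset`, the `eval_fraction` and `seed` parameters, and the one-example-per-line formatting are all assumptions made for this example.

```python
import random
from typing import List, Sequence, Tuple


def split_dataset(
    texts: Sequence[str],
    eval_fraction: float = 0.1,  # hypothetical default, not taken from the repo
    seed: int = 42,              # hypothetical default, not taken from the repo
) -> Tuple[List[str], List[str]]:
    """Normalize, shuffle deterministically, and split texts into (train, eval)."""
    if not 0.0 < eval_fraction < 1.0:
        raise ValueError("eval_fraction must be strictly between 0 and 1")

    # Collapse internal whitespace so every example fits on exactly one line,
    # then drop examples that end up empty.
    cleaned = [" ".join(text.split()) for text in texts]
    cleaned = [text for text in cleaned if text]

    # A seeded shuffle keeps the split reproducible across runs.
    rng = random.Random(seed)
    rng.shuffle(cleaned)

    # Keep at least one evaluation example whenever there is any data.
    n_eval = max(1, int(len(cleaned) * eval_fraction)) if cleaned else 0
    return cleaned[n_eval:], cleaned[:n_eval]
```

The two returned lists could then be written one example per line to separate train and evaluation files, which is a common input layout for language-model training scripts.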
# Train German FoodBERT
language_modeling
# Vocab Files
- bert-base-german-cased_tokenizer.json: original bert-base-german-cased tokenizer file
- bert_vocab.txt: original bert-base-german-cased vocab
- used_ingredients: all ingredients in the dataset
- vocab.txt: German FoodBERT vocabulary
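
One plausible way these files relate is that vocab.txt is the original vocabulary plus the ingredients it does not yet contain. The sketch below illustrates only that idea. It is an assumption, not the logic of make_new_vocab.py, and the helper name `extend_vocab` is hypothetical. It also assumes each ingredient is already a single token string.

```python
from typing import Iterable, List


def extend_vocab(base_tokens: Iterable[str], ingredients: Iterable[str]) -> List[str]:
    """Append ingredients missing from the base vocabulary, preserving order."""
    # Keep the base tokens first and in order so their ids do not change.
    vocab = list(base_tokens)
    seen = set(vocab)

    for ingredient in ingredients:
        token = ingredient.strip()
        # Skip blanks, tokens already in the base vocab, and repeated ingredients.
        if token and token not in seen:
            vocab.append(token)
            seen.add(token)
    return vocab


base = ["[PAD]", "[UNK]", "Salz"]
print(extend_vocab(base, ["Salz", "Zucker", " Mehl ", "Zucker"]))
# prints ['[PAD]', '[UNK]', 'Salz', 'Zucker', 'Mehl']
```

Appending new entries at the end, rather than inserting them, leaves the ids of the pretrained tokens untouched, so the existing embeddings stay aligned and only the new rows need to be learned.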