
# Vocab

To create the vocab.txt file, run make_new_vocab.py.

# Prep dataset

prep_dataset_training: Formats and splits the dataset so it can be used for training. Before running it, set which dataset version to build.
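
The splitting step could look roughly like the following. This is only a minimal sketch and not the code in prep_dataset_training: the function name `split_dataset`, the `val_fraction` and `seed` parameters, and the sample strings are all hypothetical, chosen for illustration.

```python
import random


def split_dataset(samples, val_fraction=0.1, seed=42):
    """Shuffle samples reproducibly and split them into (train, validation).

    Hypothetical helper: the name, parameters, and defaults are assumptions,
    not taken from prep_dataset_training.
    """
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be between 0 and 1")
    shuffled = list(samples)  # copy so the caller's list stays untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed gives a repeatable split
    n_val = max(1, int(len(shuffled) * val_fraction))  # keep at least one validation sample
    return shuffled[n_val:], shuffled[:n_val]


# Made-up stand-ins for the formatted training texts
recipes = [f"Rezept {i}" for i in range(20)]
train, val = split_dataset(recipes, val_fraction=0.2)
print(len(train), len(val))
```

Fixing the seed means that rebuilding a given dataset version yields the same train and validation sets.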

# Train German FoodBERT

language_modeling

# Vocab Files

- bert-base-german-cased_tokenizer.json: original bert-base-german-cased tokenizer file
- bert_vocab.txt: original bert-base-german-cased vocabulary
- used_ingredients: all ingredients in the dataset
- vocab.txt: German FoodBERT vocabulary
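
One plausible way these files relate is that vocab.txt is the original vocabulary with the missing ingredients appended. The sketch below illustrates that idea only; it is an assumption, not the logic of make_new_vocab.py. The function `extend_vocab` and the example tokens are hypothetical, and the sketch assumes each ingredient is a single token.

```python
def extend_vocab(base_tokens, ingredients):
    """Return base_tokens followed by every ingredient not already present.

    Hypothetical helper (not from make_new_vocab.py). Existing tokens keep
    their positions, so their ids in the original vocabulary stay unchanged.
    """
    seen = set(base_tokens)
    merged = list(base_tokens)
    for ingredient in ingredients:
        token = ingredient.strip()
        if token and token not in seen:  # skip blanks and duplicates
            seen.add(token)
            merged.append(token)
    return merged


# Made-up stand-ins for the contents of bert_vocab.txt and used_ingredients
base = ["[PAD]", "[UNK]", "Salz", "Zucker"]
ingredients = ["Zucker", "Kurkuma", "Kurkuma", " Quark "]
new_vocab = extend_vocab(base, ingredients)
print(new_vocab)
```

Appending rather than re-sorting keeps the ids of the original tokens stable, which matters when training starts from the pretrained bert-base-german-cased weights.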