initial commit of project
15
train_model/README.md
Normal file
@@ -0,0 +1,15 @@
# Vocab

To create the vocab.txt file, run **make_new_vocab.py**.
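The script itself is not shown in this commit. As a rough sketch of what such a step could look like, the snippet below assumes that bert_vocab.txt and used_ingredients each hold one entry per line and that new ingredients are simply appended to the base vocabulary. The helper names (`extend_vocab`, `read_lines`, `write_vocab`) and the append strategy are hypothetical and may differ from what make_new_vocab.py actually does.

```python
def read_lines(path):
    """Read a text file with one entry per line, dropping the newline."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]


def extend_vocab(base_tokens, ingredients):
    """Return base_tokens plus every ingredient not already present.

    Existing tokens keep their position, so their ids stay unchanged
    (assumption: ids are derived from the line number in the vocab file).
    """
    seen = set(base_tokens)
    extended = list(base_tokens)
    for ingredient in ingredients:
        token = ingredient.strip()
        if token and token not in seen:
            seen.add(token)
            extended.append(token)
    return extended


def write_vocab(path, tokens):
    """Write one token per line."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(tokens) + "\n")


# Hypothetical usage (file names taken from the list of vocab files):
# tokens = extend_vocab(read_lines("bert_vocab.txt"), read_lines("used_ingredients"))
# write_vocab("vocab.txt", tokens)
```

Appending rather than reordering is a deliberate choice in this sketch: it keeps the original bert-base-german-cased tokens at their existing positions.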
# Prep dataset

**prep_dataset_training**: Formats and splits the dataset so it can be used for training. Before running, adjust which dataset version it should build.
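How the script splits the data is not described here. A minimal sketch of a reproducible train/validation split, assuming the dataset is a list of text samples, could look like this. The function name `split_dataset`, the 10% validation share, and the fixed seed are illustrative assumptions, not values taken from prep_dataset_training.

```python
import random


def split_dataset(samples, val_fraction=0.1, seed=42):
    """Shuffle samples reproducibly and return (train, validation) lists."""
    if not 0.0 < val_fraction < 1.0:
        raise ValueError("val_fraction must be between 0 and 1")
    shuffled = list(samples)
    # A dedicated Random instance keeps the split independent of global state.
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * val_fraction))
    return shuffled[n_val:], shuffled[:n_val]
```

Fixing the seed means the same samples land in the validation set on every run, which makes results from different training runs comparable.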
# Train German FoodBERT

Training is done with **language_modeling**.
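BERT-style models are usually trained with a masked language modeling objective, and it is assumed here (not confirmed by this README) that language_modeling does the same. The sketch below shows the core idea in a simplified form: randomly hide some tokens and keep the originals as prediction targets. The function `mask_tokens` is hypothetical, and it always substitutes `[MASK]`, whereas the original BERT recipe also sometimes keeps the token or swaps in a random one.

```python
import random


def mask_tokens(tokens, mask_token="[MASK]", mask_prob=0.15, seed=None):
    """Return (masked_tokens, labels) for one tokenized sequence.

    labels holds the original token at masked positions and None elsewhere,
    so a loss would only be computed where a token was hidden.
    """
    rng = random.Random(seed)
    masked, labels = [], []
    for token in tokens:
        if rng.random() < mask_prob:
            masked.append(mask_token)
            labels.append(token)
        else:
            masked.append(token)
            labels.append(None)
    return masked, labels
```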
# Vocab files

**bert-base-german-cased_tokenizer.json**: original bert-base-german-cased tokenizer file

**bert_vocab.txt**: original bert-base-german-cased vocabulary

**used_ingredients**: all ingredients in the dataset

**vocab.txt**: German FoodBERT vocabulary