added to README files, added full dataset versions to data
@@ -1,4 +1,4 @@
-#Vocab
+# Vocab
 To create vocab.txt file, run **make_new_vocab.py**
 
 # Prep dataset
@@ -8,7 +8,7 @@ To create vocab.txt file, run **make_new_vocab.py**
 **language_modeling**
 
 
-#Vocab Files:
+# Vocab Files:
 **bert-base-german-cased_tokenizer.json**: original bert-base-german-cased tokenizer file
 **bert_vocab.txt**: original bert-base-german-cased vocab
 **used_ingredients**: all ingredients in dataset