# Prepare Dataset for Training
## Requirements
Install the spaCy model:

    python -m spacy download de_core_news_lg
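Before running the steps, it can help to confirm that the model is visible to the current Python environment. The following is a minimal sketch, assuming the download command installed the model as a regular importable Python package; the helper name `model_installed` is hypothetical and not part of this repository.

```python
import importlib.util

MODEL_NAME = "de_core_news_lg"  # model name taken from the install command


def model_installed(name: str = MODEL_NAME) -> bool:
    """Return True if a package with this name can be found (hypothetical helper)."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if model_installed():
        print(f"{MODEL_NAME} is installed")
    else:
        print(f"{MODEL_NAME} is missing, run: python -m spacy download {MODEL_NAME}")
```

The check only looks the package up and does not load the model, so it is quick to run.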
## Steps
1. combine_craped_dataset_parts: Combine the parts of the dataset, if needed.
   Adapt the paths and the number of dataset parts before running it.
2. dataset_helpers: Clean the ingredients and keep only ingredients that are used more than 20 times.
3. dataset_instructions_helpers: Clean the ingredients in the steps and split each step into sentences.
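The frequency filter in step 2 can be sketched roughly as follows. This is an illustration only, not the code in dataset_helpers: the function name `filter_rare_ingredients`, the list-of-lists data layout, and the choice to count an ingredient once per recipe are all assumptions.

```python
from collections import Counter

MIN_COUNT = 20  # threshold from step 2: keep ingredients used more than 20 times


def filter_rare_ingredients(recipes, min_count=MIN_COUNT):
    """Keep only ingredients that occur in more than min_count recipes.

    `recipes` is assumed to be a list of ingredient lists (hypothetical layout).
    """
    # Assumption: count each ingredient once per recipe, so a duplicate
    # inside a single recipe does not inflate its usage count.
    counts = Counter(ing for recipe in recipes for ing in set(recipe))
    return [[ing for ing in recipe if counts[ing] > min_count] for recipe in recipes]


if __name__ == "__main__":
    # Tiny made-up example: "Mehl" and "Zucker" are frequent, "Safran" is rare.
    recipes = [["Mehl", "Zucker"]] * 21 + [["Safran", "Mehl"]]
    filtered = filter_rare_ingredients(recipes)
    print(filtered[-1])
```

Recipes keep their position in the list, so a recipe whose ingredients are all rare ends up as an empty list instead of being dropped.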