initial commit of project
12
clean_dataset/README.md
Normal file
@@ -0,0 +1,12 @@
# Prepare Dataset for Training
## Requirements
Install the German spaCy model:

    python -m spacy download de_core_news_lg
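Because spaCy pipelines are installed as regular Python packages, a quick check before running the steps can confirm that the model is available. The following is a minimal sketch; the helper name `model_installed` is hypothetical and not part of this project.

```python
import importlib.util

# Model named in the Requirements section.
MODEL_NAME = "de_core_news_lg"


def model_installed(name: str = MODEL_NAME) -> bool:
    """Return True if a top-level package with this name can be found."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    if model_installed():
        print(f"{MODEL_NAME} is installed.")
    else:
        print(f"Missing model; run: python -m spacy download {MODEL_NAME}")
```

The check only looks the package up and does not load the pipeline, so it is cheap to run at the start of a script.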
## Steps
1. combine_craped_dataset_parts: Combine the parts of the dataset if needed.
   Adapt the paths and the number of dataset parts first!
2. dataset_helpers: Clean the ingredients and keep only ingredients that are used more than 20 times.
3. dataset_instructions_helpers: Clean the ingredients in the steps and split the steps into separate sentences.
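
The frequency filter in step 2 can be sketched as follows. This is an illustration, not the code in dataset_helpers: it assumes each recipe is a dict with an `"ingredients"` list of strings, counts every occurrence across all recipes, and uses the hypothetical function name `keep_frequent_ingredients`.

```python
from collections import Counter


def keep_frequent_ingredients(recipes, min_count=20):
    """Drop ingredients that occur min_count times or fewer across all recipes.

    Assumed input: a list of dicts, each with an "ingredients" list of strings.
    """
    # Count every ingredient occurrence over the whole dataset.
    counts = Counter(
        ingredient for recipe in recipes for ingredient in recipe["ingredients"]
    )
    # "More than 20 times" means a strict comparison against the threshold.
    frequent = {name for name, n in counts.items() if n > min_count}
    # Return new recipe dicts; the input list is left unchanged.
    return [
        {**recipe, "ingredients": [i for i in recipe["ingredients"] if i in frequent]}
        for recipe in recipes
    ]
```

Whether a recipe that ends up with no ingredients should be dropped entirely is a separate decision that this sketch leaves open.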