added to README files, added full dataset versions to data

2021-04-15 20:19:09 +02:00
parent cf40ad15fb
commit 1ea0677029
9 changed files with 61 additions and 543 deletions

@@ -0,0 +1,18 @@
Some parameters (e.g. the model version) need to be adjusted in each script before it is run.
## Generate Substitute Recommendations
**generate_substitutes.py** generates the substitute recommendations for each model at various scoring thresholds. The model version and the scoring threshold need to be specified.
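
To illustrate what applying a scoring threshold could look like, here is a minimal sketch. It is not taken from **generate_substitutes.py**: the function name `filter_by_threshold`, the input layout (ingredient mapped to a list of `(candidate, score)` pairs), and the convention that a higher score means a better substitute are all assumptions made for this example.

```python
from typing import Dict, List, Tuple


def filter_by_threshold(
    scored: Dict[str, List[Tuple[str, float]]],
    threshold: float,
) -> Dict[str, List[str]]:
    """Keep, per ingredient, only the candidates scoring at least `threshold`.

    Assumes higher scores are better. Ingredients left without any
    candidate are dropped from the result.
    """
    result: Dict[str, List[str]] = {}
    for ingredient, candidates in scored.items():
        kept = [name for name, score in candidates if score >= threshold]
        if kept:
            result[ingredient] = kept
    return result


# Hypothetical scores, for illustration only.
scored = {
    "butter": [("margarine", 0.92), ("olive oil", 0.61)],
    "saffron": [("turmeric", 0.40)],
}
substitutes = filter_by_threshold(scored, threshold=0.6)
```

In this sketch, raising the threshold shrinks the candidate lists, and an ingredient whose candidates all fall below it disappears from the output entirely.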
## Prepare Data for Evaluation
**find_ground_truth_ingredients.py** was used to find "rare" and "frequent" ingredients for the ground truth.
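
The README does not say how "rare" and "frequent" are defined. One plausible reading, sketched below under that assumption, is to count in how many recipes each ingredient occurs and apply two cutoffs. The function name `split_by_frequency`, the cutoffs `rare_max` and `frequent_min`, and the recipe layout are hypothetical and may differ from what **find_ground_truth_ingredients.py** does.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def split_by_frequency(
    recipes: Iterable[List[str]],
    rare_max: int,
    frequent_min: int,
) -> Tuple[List[str], List[str]]:
    """Return (rare, frequent) ingredients based on recipe counts.

    An ingredient is counted once per recipe, even if it is listed twice.
    """
    counts = Counter(ing for recipe in recipes for ing in set(recipe))
    rare = sorted(ing for ing, n in counts.items() if n <= rare_max)
    frequent = sorted(ing for ing, n in counts.items() if n >= frequent_min)
    return rare, frequent


# Hypothetical recipes, for illustration only.
recipes = [
    ["salt", "butter", "flour"],
    ["salt", "saffron"],
    ["salt", "butter"],
]
rare, frequent = split_by_frequency(recipes, rare_max=1, frequent_min=3)
```
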
Ingredients for which no substitute recommendations were found need to be added to the substitute JSON file. This is done using **add_unused_ingredients.py**.
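
As a rough sketch of this step, assuming the substitute JSON file maps each ingredient to a list of substitutes and that a missing ingredient is represented by an empty list (neither is confirmed by the README), the completion could look like this. The function name and the sample data are hypothetical.

```python
import json
from typing import Dict, Iterable, List


def add_unused_ingredients(
    substitutes: Dict[str, List[str]],
    all_ingredients: Iterable[str],
) -> Dict[str, List[str]]:
    """Return a copy of `substitutes` with an empty list for every
    ingredient that has no recommendations yet."""
    completed = dict(substitutes)
    for ingredient in all_ingredients:
        completed.setdefault(ingredient, [])
    return completed


# Hypothetical data, for illustration only.
substitutes = {"butter": ["margarine"]}
completed = add_unused_ingredients(substitutes, ["butter", "saffron"])

# Serialize in the same shape the sketch assumes for the JSON file.
serialized = json.dumps(completed, indent=2, sort_keys=True)
```

Keeping every ingredient in the file, even with an empty list, means a later evaluation can tell "no recommendation was made" apart from "ingredient was never considered".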
## Evaluation
An intermediate evaluation was done using **stats_engl_substitutes_compare.py** to gain insight into the various versions of the substitute recommendations. However, this script is not used for the final evaluation.
The ingredient substitute recommendations made using each FoodBERT version can be evaluated using **final_eval.py**.
The version to evaluate has to be set in the first line of main().
Stats for the dataset and the ground truth can be found using **dataset_stats.py** and **ground_truth_stats.py**, respectively.