added dataset splitter to readme

2023-10-22 19:07:57 +02:00
commit 39d30708ec
parent 0bd7b7809a
1 changed file with 13 additions and 1 deletion
--- a/README.md
+++ b/README.md
@@ -113,7 +113,7 @@ $ python ./src/helpers/test/decision_tree.py -i ./out/result-(date/time).csv -o
 ### Template extraction
 > :warning: **Please note:** <br>
 > This tool uses the legacy format for datasets.<br>
-> Images are sorted using folders, instead of by name
+> Images are sorted using folders, instead of by name.

 1. Images should have four standard Aruco markers clearly visible
 2. Run the template extraction tool with an input directory as argument
@@ -122,6 +122,18 @@ $ python ./src/experiments/template_extraction/script.py ./dataset
 ```
 3. The script generates new folders, ending with `_out`
 4. The paths to any failed images are saved in `skipped.txt`
+
+### Dataset splitting
+1. Ensure that the dataset is in `./res/dataset`
+2. Run the dataset splitter tool:
+```sh
+$ python ./src/experiments/dataset.py
+```
+3. Three new folders will be created, containing the following percentages of images:
+    - `./res/dataset/training`, 70%
+    - `./res/dataset/validation`, 20%
+    - `./res/dataset/training`, 10%
+4. Images are split pseudorandomly, so the split is reproducible: the same datasets are created on different machines.
 ---

 Arne van Iterson<br>
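
The "Dataset splitting" section added in this commit describes a reproducible 70/20/10 split. The sketch below shows one way such a split could work. It is not the code in `./src/experiments/dataset.py`. The function name `split_dataset`, the `seed` value, and the split name `testing` are all assumptions; the README lists `./res/dataset/training` twice, so the name of the 10% folder is a guess. Reproducibility here relies on sorting the file names and shuffling with a fixed seed, and is expected to hold across machines that run the same Python version.

```python
import random


def split_dataset(names, seed=42, ratios=(0.7, 0.2, 0.1)):
    """Split file names into three lists in a reproducible way.

    Hypothetical helper: the name, seed and "testing" key are assumptions,
    not taken from the repository.
    """
    # Sort first so the result does not depend on directory listing order.
    ordered = sorted(names)
    # A private, seeded generator yields the same shuffle on every run.
    random.Random(seed).shuffle(ordered)

    n_train = int(len(ordered) * ratios[0])
    n_val = int(len(ordered) * ratios[1])
    return {
        "training": ordered[:n_train],
        "validation": ordered[n_train:n_train + n_val],
        # The remainder goes to the last split, so rounding never drops an image.
        "testing": ordered[n_train + n_val:],
    }


if __name__ == "__main__":
    images = [f"img_{i:03d}.png" for i in range(100)]
    for split, files in split_dataset(images).items():
        print(split, len(files))
```

Returning lists of names rather than moving files keeps the sketch free of side effects; a real tool would copy or move each file into the matching folder afterwards.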