From 39d30708ec5e1f53ade8516e3ef5298f18e9b2d5 Mon Sep 17 00:00:00 2001
From: Tom Selier
Date: Sun, 22 Oct 2023 19:07:57 +0200
Subject: [PATCH] added dataset splitter to readme

---
 README.md | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/README.md b/README.md
index a890566..2cad4e2 100644
--- a/README.md
+++ b/README.md
@@ -113,7 +113,7 @@ $ python ./src/helpers/test/decision_tree.py -i ./out/result-(date/time).csv -o
 ### Template extraction
 > :warning: **Please note:**
 > This tool uses the legacy format for datasets.
-> Images are sorted using folders, instead of by name
+> Images are sorted using folders, instead of by name.
 1. Images should have four standard Aruco markers clearly visible
 2. Run the template extraction tool with an input directory as argument
@@ -122,6 +122,18 @@ $ python ./src/experiments/template_extraction/script.py ./dataset
 ```
 3. The script generates new folders, ending with `_out`
 4. The paths to any failed images are saved in `skipped.txt`
+
+### Dataset splitting
+1. Ensure that the dataset is in `./res/dataset`
+2. Run the dataset splitter tool:
+```sh
+$ python ./src/experiments/dataset.py
+```
+3. Three new folders will be created, containing the following percentages of images:
+   - `./res/dataset/training`, 70%
+   - `./res/dataset/validation`, 20%
+   - `./res/dataset/testing`, 10%
+4. Images are split pseudorandomly, so the same datasets are produced on different machines.
 
 ---
 Arne van Iterson
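The reproducible 70/20/10 split described in step 4 of the new section can be sketched as follows. This is a hypothetical illustration, not the actual `./src/experiments/dataset.py` implementation: the seed value, the `.png` file layout, and the `split_dataset` helper name are all assumptions; the only facts taken from the patch are the folder names and the 70/20/10 ratios. Shuffling a sorted file list with a fixed-seed `random.Random` is what makes the split pseudorandom yet identical across machines.

```python
import random
import shutil
from pathlib import Path


def split_dataset(root: str, seed: int = 0) -> None:
    """Split images under `root` into training/validation/testing (70/20/10)."""
    # Sort first so the input order is identical on every machine,
    # then shuffle with a fixed seed so the split is reproducible.
    images = sorted(Path(root).glob("*.png"))
    random.Random(seed).shuffle(images)

    # Integer arithmetic avoids float-rounding surprises in the boundaries.
    n = len(images)
    n_train = n * 7 // 10
    n_val = n * 2 // 10
    splits = {
        "training": images[:n_train],
        "validation": images[n_train:n_train + n_val],
        "testing": images[n_train + n_val:],
    }

    # Copy each image into its split folder under the dataset root.
    for name, files in splits.items():
        out_dir = Path(root) / name
        out_dir.mkdir(exist_ok=True)
        for img in files:
            shutil.copy2(img, out_dir / img.name)
```

Copying (rather than moving) leaves the original flat dataset intact, so the split can be regenerated with a different seed if needed.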